Yann Neuhaus

dbi services technical blog

Feedback on the EMEA Red Hat Partner Conference 2019 in Prague

Wed, 2019-08-07 04:06

A few weeks ago I attended the Red Hat EMEA Partner Conference for the first time. The Red Hat Conference took place in Prague from June 25th to 27th. If you are interested in Open Source technologies and in Red Hat, feel free to read this personal feedback on trends at Red Hat and in the IT sector.

Stronger Together!

Representing dbi services at the Partner Conference in Prague was a great opportunity for us as a Red Hat Advanced Partner.

About 850 people attended this enjoyable event! Interactions with the Red Hat Community were very interesting and relaxed. Is it because of the Open Source atmosphere? The organization, catering, and location were great as well. Many thanks to the organizers!

Also a sincere thank you to all Swiss Red Hat and Tech Data contacts at the event for welcoming and assisting Swiss Partners during the 3 days. Everything was handled very professionally thanks to Leonard Bodmer (Country Manager Red Hat Switzerland), Richard Zobrist (Head of Partner & Alliances Red Hat), Maja Zurovec (Account Manager Partner & Alliances Red Hat), Sandra Maria Sigrist (Tech Data), and Daria Stempkowski (Tech Data). Many thanks to all of you, also for the warm and relaxed evening at Villa Richter at the foot of Prague’s castle!

We are Stronger Together!

All about automation, integration, hybrid cloud, and multi-cloud

With this 3-day Partner Conference, Red Hat offered a broad agenda of Breakouts, Exams, Hackathon, Keynotes, Labs, and an Open Innovation Lab. I mostly attended sessions where Red Hat partners and customers had the opportunity to give feedback on their experience with Red Hat products. Some of the sessions and keynotes were remarkable.

Red Hat Middleware Roadmap

The “Red Hat Middleware Roadmap” sessions (Part 1 & 2) with Rich Sharpels were a good opportunity to learn more about productivity (automation, integration, runtimes), reliability (security, reduced complexity), and flexibility (containers for scaling, cloud, hybrid-cloud, multi-cloud) with OpenShift. These 2 presentations also covered the iPaaS, a new private integration Platform-as-a-Service offering that provides cloud-based services for application integration and messaging. The goal here is to strengthen collaboration within the business teams (DevOps) thanks to Managed Integration + OpenShift Dedicated. Rich Sharpels summarized the benefits of the iPaaS with: “cloud services and packages where customers don’t have to administrate anything!”

Ansible Partner Enablement Offerings

Günter Herold from Red Hat and Daniel Knözinger from Open Networks Austria held the session “Ansible Partner Enablement Offerings”. This session focused on the advantages of automating tasks with Ansible to reduce mistakes, errors, and complexity because “Ansible is the universal language for the whole IT team”. With Ansible, “start small and develop”.

Best Practices for Working with Red Hat Support

Those who wanted to learn about “Best Practices for Working with Red Hat Support” attended the session with Caroline Baillargeon, Leona Meeks, and Peter Jakobs from Red Hat. This presentation was an opportunity to learn about and discuss:

  • The Customer Portal which is said to be “full of information and best practices that could be useful before opening and escalating issues”. For example, should you search for information on Ansible, have a look at this page
  • The TSAnet Connect where answers for multi IT solutions are centralized
  • The Case Management Tool for sharing the open source spirit and to be part of the community (example)
  • Tips to work efficiently with the Red Hat support:
    1. “Make sure the customer is registered on the correct time zone”
    2. “To get 7×24 support, a Premium Support subscription is needed”
    3. “in case the answer on an issue is not appropriate, use the escalation button”
    4. “contact Red Hat to get access to the trainings that also the Red Hat engineers follow for technical problem solving”

While keeping the end customer’s satisfaction in mind, this session could probably be best summarized with “why not contribute and share knowledge within the Red Hat community?”

Keynotes

Keynotes mainly concentrated on “marketing topics” that aim at boosting partner engagement, but they were still interesting, in particular:

  • “The Killer App in digital transformation is human connection” with Margaret Dawson on collaborating within the community
  • “Open Hybrid Cloud Ecosystems: Bold Goals for Tomorrow” with Lars Herrmann on innovations that have been considered impossible. In short, “if you think it is impossible, just do it”, so develop cloud, hybrid cloud and multi-cloud opportunities…

 

On Red Hat’s and Partner’s booths at the Congress Center

Besides sessions and presentations, lots of interesting technical booths at the Congress Center Prague promoted the work of the Red Hat engineers within the Open Source community. In particular, I spent some time with Romain Pelisse (Red Hat Berlin) and Robert Zahradnícek (Red Hat Brno), who explained to me how they work and what the trends are in their areas. Of course we spoke about automation and integration, and about findings that are developed within the open source community first, before being implemented in Red Hat solutions.

Last but not least, some Red Hat partners were present with a booth to promote their activities and products during the conference, among them Tech Data and Arrow ECS, which are well known at dbi services.

What to take from the Red Hat EMEA Conference?

On Red Hat and the Conference

At the end of the day, the keywords from the Red Hat EMEA Conference were probably not that far from the keywords you would get from other technology conferences. Concepts and products like “automation”, “integration”, “Ansible”, or “OpenShift” are means to get companies into the cloud. But why not? The trend towards the cloud is getting clearer and clearer, as it now makes sense for lots of projects, at least for Disaster Recovery, Test, and Development in the cloud.

Whether to go for private cloud, hybrid cloud, or multi-cloud is not the topic at Red Hat: their solutions are agnostic. And Red Hat’s strategy is clearly based on a strong commitment to open source. It’s all about “products” (not projects), “collaboration”, “community”, and “customer success”.

On Open Source trends and strategy

Now you may ask why you should subscribe to Red Hat’s products and support. Sure, with Ansible and other Open Source products you can easily “start small and develop”, so the community version may fit. But what if you go into production? The more numerous and the bigger the projects become, the more you will need support, and subscribing will probably make sense.

Also, don’t forget that Open Source is not free. Whether you go for community or enterprise Open Source makes no difference: in the end you will need to invest at least time and knowledge. And, depending on the situation, you may subscribe to products and support. If you don’t know where to start, ask dbi services for Open Source expertise.

Looking forward to reading your comments.

The article Feedback on the EMEA Red Hat Partner Conference 2019 in Prague first appeared on Blog dbi services.

Sparse OVM virtual disks on appliances

Mon, 2019-08-05 00:23

For some reason, you may need to sparsify OVM virtual disks on an Oracle appliance. Even though that feature is available through the OVM Manager, most Oracle appliances don’t have any OVM Manager deployed. Therefore, if you un-sparse your virtual disk by mistake, you are on your own.

This note explains how to make un-sparsed virtual disks sparse again.

Stop I/Os on the virtual disk

First, ensure the VM using the disk is stopped:
xm shutdown {VM_NAME}

For instance:
xm shutdown exac01db01.domain.local

Sparse disk


dd if={PATH_TO_DISK_TO_BE_SPARSED} of={PATH_TO_NEW_SPARSED_DISK} conv=sparse

For instance:

dd if=/EXAVMIMAGES/GuestImages/exac01db01.domain.local/vdisk_root_01.img \
of=/staging/vdisk_root_01.img \
conv=sparse
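
Before moving the new image back, it can be worth checking that the copy is byte-identical and really sparse. The following is only a minimal sketch: the function name sparsify_image is hypothetical, and it assumes GNU dd, cmp and du are available on the host.

```shell
# Hypothetical helper: sparsify a disk image and verify the copy.
# Assumes GNU coreutils (dd conv=sparse, du --apparent-size).
sparsify_image() {
  local src="$1" dst="$2"
  # conv=sparse makes dd seek over all-zero blocks instead of writing them
  dd if="$src" of="$dst" bs=1M conv=sparse 2>/dev/null || return 1
  # The content must stay byte-identical; only the disk allocation changes
  cmp -s "$src" "$dst" || { echo "content mismatch" >&2; return 1; }
  echo "apparent KiB:  $(du -k --apparent-size "$dst" | cut -f1)"
  echo "allocated KiB: $(du -k "$dst" | cut -f1)"
}
```

For instance, sparsify_image /EXAVMIMAGES/GuestImages/exac01db01.domain.local/vdisk_root_01.img /staging/vdisk_root_01.img would print both sizes; on a filesystem that supports sparse files, the allocated size should be lower than the apparent size when the disk contains large zeroed areas.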

Move disk to former location

Once the sparsing operation has finished, move the disk back to its former location:

# Retrieve the disk paths:
grep disk /EXAVMIMAGES/GuestImages/{VM_NAME}/vm.cfg
# Move each disk back to its location:
mv /staging/{DISK_NAME}.img /EXAVMIMAGES/GuestImages/{VM_NAME}/{DISK_NAME}.img

For instance:

mv /staging/vdisk_root_01.img /EXAVMIMAGES/GuestImages/exac01db01.domain.local/vdisk_root_01.img

Restart the VM

Then you can restart the VM, which will use the new disk:
xm create /EXAVMIMAGES/GuestImages/{VM_NAME}/vm.cfg

I hope this helps and please contact us or comment below should you need more details.

The article Sparse OVM virtual disks on appliances first appeared on Blog dbi services.

Alfresco Clustering – Solr6

Sat, 2019-08-03 01:00

In previous blogs, I talked about some basics and presented some possible architectures for Alfresco, and I talked about the Clustering setup for the Alfresco Repository, Alfresco Share, and ActiveMQ. I also set up an Apache HTTPD as a Load Balancer. In this one, I will talk about the last layer that I wanted to present, which is Solr and more particularly Solr6 (Alfresco Search Services) Sharding. I planned on writing a blog related to Solr Sharding Concepts & Methods to explain what it brings concretely but unfortunately, it’s not ready yet. I will try to post it in the next few weeks, if I find the time.

 

I. Solr configuration modes

So, Solr supports/provides three configuration modes:

  • Master-Slave
  • SolrCloud
  • Standalone


Master-Slave: It’s the first specific configuration mode, which is pretty old. In this one, the Master node is the only one to index the content and all the Slave nodes replicate the Master’s index. This is a first step to provide a Clustering solution with Solr, and Alfresco supports it, but this solution has some important drawbacks. For example, and contrary to an ActiveMQ Master-Slave solution, Solr cannot change the Master. Therefore, if you lose your Master, there is no indexing happening anymore and you need to manually change the configuration file on each of the remaining nodes to specify a new Master and point all the remaining Slave nodes to the new Master. This isn’t what I will be talking about in this blog.

SolrCloud: It’s another specific configuration mode which is a little bit more recent, introduced in Solr4 I believe. SolrCloud is a true Clustering solution using a ZooKeeper Server. It adds an additional layer on top of a Standalone Solr which slows it down a little bit, especially on infrastructures with a huge demand on indexing. But at some point, when you start having dozens of Solr nodes, you need a central place to organize and configure them and that’s what SolrCloud is very good at. This solution provides Fault Tolerance as well as High Availability. I’m not sure whether SolrCloud could be used by Alfresco: sure, SolrCloud also has Shards and its behaviour is pretty similar to a Standalone Solr, but it doesn’t work in entirely the same way. Maybe it’s possible, however I have never seen it so far. Might be the subject of some testing later… In any case, using a SolrCloud for Alfresco might not be that useful because it’s much easier to set up a Master-Master Solr mixed with Solr Sharding for pretty much the same benefits. So, I won’t talk about SolrCloud here either.

You guessed it, in this blog, I will only talk about Standalone Solr nodes and only using Shards. Alfresco has supported Solr Shards only since version 5.1. Before that, it wasn’t possible to use this feature, even if Solr4 provided it already. When using the two default cores (the famous “alfresco” & “archive” cores), with all Alfresco versions (all supporting Solr… So since Alfresco 4), it is possible to have a Highly Available Solr installation by setting up two Solr Standalone nodes and putting a Load Balancer in front of them, but in this case, there is no communication between the Solr nodes so it’s only a HA solution, nothing more.

 

In the architectures that I presented in the first blog of this series, if you remember the schema N°5 (you probably don’t but no worries, I didn’t either), I put a link between the two Solr nodes and I mentioned the following related to this architecture:
“N°5: […]. Between the two Solr nodes, I put a Clustering link, that’s in case you are using Solr Sharding. If you are using the default cores (alfresco and archive), then there is no communication between distinct Solr nodes. If you are using Solr Sharding and if you want a HA architecture, then you will have the same Shards on both Solr nodes and in this case, there will be communications between the Solr nodes, it’s not really a Clustering so to speak, that’s how Solr Sharding is working but I still used the same representation.”

 

II. Solr Shards creation

As mentioned earlier in this blog, there are real Cluster solutions with Solr but in the case of Alfresco, because of the features that Alfresco adds like the Shard Registration, there is no real need to set up complex things like that. Having just a simple Master-Master installation of Solr6 with Sharding is already a very good and strong solution to provide Fault Tolerance, High Availability, Automatic Failover, Performance improvements, and so on. So how can that be set up?

First, you will need to install at least two Solr Standalone nodes. You can use exactly the same setup for all nodes and it’s also exactly the same setup to use the default cores or Solr Sharding so just do what you are always doing. For the Tracking, you will need to use the Load Balancer URL so it can target all Repository nodes, if there are several.

If you created the default cores, you can remove them easily:

[alfresco@solr_n1 ~]$ curl -v "http://localhost:8983/solr/admin/cores?action=removeCore&storeRef=workspace://SpacesStore&coreName=alfresco"
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=removeCore&storeRef=workspace://SpacesStore&coreName=alfresco HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 150
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">524</int></lst>
</response>
* Connection #0 to host localhost left intact
[alfresco@solr_n1 ~]$
[alfresco@solr_n1 ~]$ curl -v "http://localhost:8983/solr/admin/cores?action=removeCore&storeRef=archive://SpacesStore&coreName=archive"
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=removeCore&storeRef=archive://SpacesStore&coreName=archive HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 150
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">485</int></lst>
</response>
* Connection #0 to host localhost left intact
[alfresco@solr_n1 ~]$

 

A status of “0” means that it’s successful.
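
If you want to script these calls instead of reading the XML by eye, the status can be extracted from the response. This is only a minimal sketch: the function name solr_status is hypothetical and it simply assumes the response has the same shape as the outputs shown above.

```shell
# Hypothetical helper: read a Solr XML response on stdin and print the
# value of <int name="status">, as seen in the responseHeader above.
solr_status() {
  sed -n 's/.*<int name="status">\([0-9][0-9]*\)<\/int>.*/\1/p' | head -n 1
}
```

It could then be used in a pipe such as curl -s "http://localhost:8983/solr/admin/cores?action=removeCore&storeRef=workspace://SpacesStore&coreName=alfresco" | solr_status, and the script can stop as soon as the printed value isn’t 0.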

Once that’s done, you can then simply create the Shards. In this example, I will:

  • use the DB_ID_RANGE method
  • use two Solr nodes
  • for workspace://SpacesStore: create 2 Shards out of a maximum of 10 with a range of 20M
  • for archive://SpacesStore: create 1 Shard out of a maximum of 4 with a range of 50M

Since I will use only two Solr nodes and since I want a High Availability on each of the Shards, I will need to have them all on both nodes. With a simple loop, it’s pretty easy to create all the Shards:

[alfresco@solr_n1 ~]$ solr_host=localhost
[alfresco@solr_n1 ~]$ solr_node_id=1
[alfresco@solr_n1 ~]$ begin_range=0
[alfresco@solr_n1 ~]$ range=19999999
[alfresco@solr_n1 ~]$ total_shards=10
[alfresco@solr_n1 ~]$
[alfresco@solr_n1 ~]$ for shard_id in `seq 0 1`; do
>   end_range=$((${begin_range} + ${range}))
>   curl -v "http://${solr_host}:8983/solr/admin/cores?action=newCore&storeRef=workspace://SpacesStore&numShards=${total_shards}&numNodes=${total_shards}&nodeInstance=${solr_node_id}&template=rerank&coreName=alfresco&shardIds=${shard_id}&property.shard.method=DB_ID_RANGE&property.shard.range=${begin_range}-${end_range}&property.shard.instance=${shard_id}"
>   echo ""
>   echo "  -->  Range N°${shard_id} created with: ${begin_range}-${end_range}"
>   echo ""
>   sleep 2
>   begin_range=$((${end_range} + 1))
> done

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=newCore&storeRef=workspace://SpacesStore&numShards=10&numNodes=10&nodeInstance=1&template=rerank&coreName=alfresco&shardIds=0&property.shard.method=DB_ID_RANGE&property.shard.range=0-19999999&property.shard.instance=0 HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 182
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">254</int></lst><str name="core">alfresco-0</str>
</response>
* Connection #0 to host localhost left intact

  -->  Range N°0 created with: 0-19999999


*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=newCore&storeRef=workspace://SpacesStore&numShards=10&numNodes=10&nodeInstance=1&template=rerank&coreName=alfresco&shardIds=1&property.shard.method=DB_ID_RANGE&property.shard.range=20000000-39999999&property.shard.instance=1 HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 182
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">228</int></lst><str name="core">alfresco-1</str>
</response>
* Connection #0 to host localhost left intact

  -->  Range N°1 created with: 20000000-39999999

[alfresco@solr_n1 ~]$
[alfresco@solr_n1 ~]$ begin_range=0
[alfresco@solr_n1 ~]$ range=49999999
[alfresco@solr_n1 ~]$ total_shards=4
[alfresco@solr_n1 ~]$ for shard_id in `seq 0 0`; do
>   end_range=$((${begin_range} + ${range}))
>   curl -v "http://${solr_host}:8983/solr/admin/cores?action=newCore&storeRef=archive://SpacesStore&numShards=${total_shards}&numNodes=${total_shards}&nodeInstance=${solr_node_id}&template=rerank&coreName=archive&shardIds=${shard_id}&property.shard.method=DB_ID_RANGE&property.shard.range=${begin_range}-${end_range}&property.shard.instance=${shard_id}"
>   echo ""
>   echo "  -->  Range N°${shard_id} created with: ${begin_range}-${end_range}"
>   echo ""
>   sleep 2
>   begin_range=$((${end_range} + 1))
> done

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> GET /solr/admin/cores?action=newCore&storeRef=archive://SpacesStore&numShards=4&numNodes=4&nodeInstance=1&template=rerank&coreName=archive&shardIds=0&property.shard.method=DB_ID_RANGE&property.shard.range=0-49999999&property.shard.instance=0 HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset=UTF-8
< Content-Length: 181
<
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">231</int></lst><str name="core">archive-0</str>
</response>
* Connection #0 to host localhost left intact

-->  Range N°0 created with: 0-49999999

[alfresco@solr_n1 ~]$
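
To get a feeling for what these ranges mean, here is a small sketch that tells which of the Shards created above would hold a given node, assuming the DB_ID_RANGE method simply assigns a node to the Shard whose range contains its DB ID. The function name shard_for_dbid and its default values (20M range, 2 Shards created, as for workspace://SpacesStore in this example) are hypothetical.

```shell
# Hypothetical helper: print the index of the Shard whose range contains
# the given DB ID, or "none" if no Shard has been created for it yet.
shard_for_dbid() {
  local dbid="$1" range_size="${2:-20000000}" created="${3:-2}"
  # Ranges are contiguous and start at 0, so the index is an integer division
  local idx=$(( dbid / range_size ))
  if [ "$idx" -lt "$created" ]; then
    echo "$idx"
  else
    echo "none"
  fi
}
```

With the setup above, shard_for_dbid 19999999 prints 0 and shard_for_dbid 20000000 prints 1, while shard_for_dbid 40000000 prints none: such a node falls outside the ranges covered so far, so a third Shard would be needed before the DB IDs reach that point.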

 

On the Solr node2, to create the same Shards (another Instance of each Shard) and therefore provide the expected setup, just re-execute the same commands, replacing solr_node_id=1 with solr_node_id=2. That’s all there is to do on the Solr side, just creating the Shards is sufficient. On the Alfresco side, configure the Shards registration to use the Dynamic mode:

[alfresco@alf_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
# Solr Sharding
solr.useDynamicShardRegistration=true
search.solrShardRegistry.purgeOnInit=true
search.solrShardRegistry.shardInstanceTimeoutInSeconds=60
search.solrShardRegistry.maxAllowedReplicaTxCountDifference=500
...
[alfresco@alf_n1 ~]$

 

After a quick restart, all the Shard’s Instances will register themselves to Alfresco and you should see that each Shard has its two Shard’s Instances. Thanks to the constant Tracking, Alfresco knows which Shard’s Instances are healthy (up-to-date) and which ones aren’t (either lagging behind or completely silent). When performing searches, Alfresco will make a request to any of the healthy Shard’s Instances. Solr will be aware of the healthy Shard’s Instances as well and it will start the distribution of the search request to all the Shards for the parallel query. This is the communication between the Solr nodes that I mentioned earlier: it’s not really Clustering but rather query distribution between all the healthy Shard’s Instances.

 

 


The article Alfresco Clustering – Solr6 first appeared on Blog dbi services.

Alfresco Clustering – Apache HTTPD as Load Balancer

Fri, 2019-08-02 01:00

In previous blogs, I talked about some basics and presented some possible architectures for Alfresco, and I talked about the Clustering setup for the Alfresco Repository, Alfresco Share, and ActiveMQ. In this one, I will talk about the Front-end layer, but in a very particular setup because it will also act as a Load Balancer. For an Alfresco solution, you can choose the front-end that you prefer and it can just act as a front-end to protect your Alfresco back-end components, to add SSL or whatever. There is no real preference but you will obviously need to know how to configure it. I posted a blog some years ago for Apache HTTPD as a simple front-end, or you can check the Alfresco documentation which now includes a section for that as well, but there is no official documentation for a Load Balancer setup.

In an Alfresco architecture that includes HA/Clustering you will, at some point, need a Load Balancer. From time to time, you will come across companies that do not already have a Load Balancer available and you might therefore have to provide something to fill this gap. Since you will most probably (should?) already have a front-end to protect Alfresco, why not use it as a Load Balancer as well? In this blog, I chose Apache HTTPD because that’s the front-end I usually use and I know it works fine as a LB as well.

In the architectures that I described in the first blog of this series, there was always a front-end installed on each node with Alfresco Share and there was a LB above that. Here, these two boxes are actually together. There are multiple ways to set that up but I didn’t want to talk about that in my first blog because it’s not really related to Alfresco, it’s above that, so it would just have multiplied the possible architectures that I wanted to present and my blog would just have been way too long. There was also no communication between the different front-end nodes because technically speaking, we aren’t going to set up Apache HTTPD as a Cluster, we only need to provide a High Availability solution.

Alright so let’s say that you don’t have a Load Balancer available and you want to use Apache HTTPD as a front-end+LB for a two-node Cluster. There are several solutions so here are two possible ways to do that from an inbound communication point of view that will still provide redundancy:

  • Set up a Round Robin DNS that points to both Apache HTTPD node1 and node2. The DNS will redirect connections to either of the two Apache HTTPD (Active/Active)
  • Set up a Failover DNS with a pretty low TimeToLive (TTL) which will point to a single Apache HTTPD node and redirect all traffic there. If this one isn’t available, it will fail over to the second one (Active/Passive)

 

In both cases above, the Apache HTTPD configuration can be exactly the same, it will work. From an outbound communication point of view, Apache HTTPD will talk directly with all the Share nodes behind it. To avoid disconnection and loss of sessions in case an Apache HTTPD goes down, the solution will need to support session stickiness across all Apache HTTPD. With that, all communications coming from a single browser will always be redirected to the same back-end server, which ensures that the sessions are still intact, even if you lose an Apache HTTPD. I mentioned previously that there won’t be any communication between the different front-ends, so this session stickiness must be based on something present inside the session (header or cookie) or inside the URL.

With Apache HTTPD, you can use the Proxy modules to provide both a front-end configuration as well as a Load Balancer but, in this blog, I will use the JK module. The JK module is provided by Apache for communications between Apache HTTPD and Apache Tomcat. It has been designed and optimized for this purpose and it also provides/supports a Load Balancer configuration.

 

I. Apache HTTPD setup for a single back-end node

For this example, I will use the package provided by Ubuntu for a simple installation. You can obviously build it from source to customize it, add your best practices, and so on. This has nothing to do with the Clustering setup, it’s a simple front-end configuration for any installation. So let’s install a basic Apache HTTPD:

[alfresco@httpd_n1 ~]$ sudo apt-get install apache2 libapache2-mod-jk
[alfresco@httpd_n1 ~]$ sudo systemctl enable apache2.service
[alfresco@httpd_n1 ~]$ sudo systemctl daemon-reload
[alfresco@httpd_n1 ~]$ sudo a2enmod rewrite
[alfresco@httpd_n1 ~]$ sudo a2enmod ssl

 

Then configure it for a single back-end Alfresco node (I’m just showing a minimal configuration again, there is much more to do to add security & restrictions around Alfresco and mod_jk):

[alfresco@httpd_n1 ~]$ cat /etc/apache2/sites-available/alfresco-ssl.conf
...
<VirtualHost *:80>
    RewriteEngine On
    RewriteRule ^/?(.*) https://%{HTTP_HOST}/$1 [R,L]
</VirtualHost>

<VirtualHost *:443>
    ServerName            dns.domain
    ServerAlias           dns.domain dns
    ServerAdmin           email@domain
    SSLEngine             on
    SSLProtocol           -all +TLSv1.2
    SSLCipherSuite        EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH:AES2
    SSLHonorCipherOrder   on
    SSLVerifyClient       none
    SSLCertificateFile    /etc/pki/tls/certs/dns.domain.crt
    SSLCertificateKeyFile /etc/pki/tls/private/dns.domain.key

    RewriteEngine On
    RewriteRule ^/$ https://%{HTTP_HOST}/share [R,L]

    JkMount /* alfworker
</VirtualHost>
...
[alfresco@httpd_n1 ~]$
[alfresco@httpd_n1 ~]$ cat /etc/libapache2-mod-jk/workers.properties
worker.list=alfworker
worker.alfworker.type=ajp13
worker.alfworker.port=8009
worker.alfworker.host=share_n1.domain
worker.alfworker.lbfactor=1
[alfresco@httpd_n1 ~]$
[alfresco@httpd_n1 ~]$ sudo a2ensite alfresco-ssl
[alfresco@httpd_n1 ~]$ sudo a2dissite 000-default
[alfresco@httpd_n1 ~]$ sudo rm -f /etc/apache2/sites-enabled/000-default.conf
[alfresco@httpd_n1 ~]$
[alfresco@httpd_n1 ~]$ sudo service apache2 restart

 

That should do it for a single back-end Alfresco node. Again, this was just an example, I wouldn’t recommend using the configuration as is (inside the alfresco-ssl.conf file), there is much more to do for security reasons.

 

II. Adaptation for a Load Balancer configuration

If you want to configure your Apache HTTPD as a Load Balancer, then on top of the standard setup shown above, you just have to modify two things:

  • Modify the JK module configuration to use a Load Balancer
  • Modify the Apache Tomcat configuration to add an identifier so that Apache HTTPD is able to redirect the communication to the correct back-end node (session stickiness). This ID put in the Apache Tomcat configuration will extend the Session’s ID like this: <session_id>.<tomcat_id>

 

So on all the nodes hosting the Apache HTTPD, you should put the exact same configuration:

[alfresco@httpd_n1 ~]$ cat /etc/libapache2-mod-jk/workers.properties
worker.list=alfworker

worker.alfworker.type=lb
worker.alfworker.balance_workers=node1,node2
worker.alfworker.sticky_session=true
worker.alfworker.method=B

worker.node1.type=ajp13
worker.node1.port=8009
worker.node1.host=share_n1.domain
worker.node1.lbfactor=1

worker.node2.type=ajp13
worker.node2.port=8009
worker.node2.host=share_n2.domain
worker.node2.lbfactor=1
[alfresco@httpd_n1 ~]$
[alfresco@httpd_n1 ~]$ sudo service apache2 reload

 

With the above configuration, we keep the same JK Worker (alfworker) but instead of using an ajp13 type, we use a lb type (worker.alfworker.type=lb), which is an encapsulation. The alfworker will use 2 sub-workers named node1 and node2 (worker.alfworker.balance_workers); these are just generic names. The alfworker will also enable stickiness and use the method B (Busyness), which means that for new sessions, Apache HTTPD will choose the worker with the fewest requests being served, divided by the lbfactor value.

Each sub-worker (node1 and node2) defines its type, which is ajp13 this time, the port and host it should target (where the Share nodes are located) and the lbfactor. As mentioned above, increasing the lbfactor means that more requests are going to be sent to this worker:

  • For the node2 to serve 100% more requests than the node1 (x2), then set worker.node1.lbfactor=1 and worker.node2.lbfactor=2
  • For the node2 to serve 50% more requests than the node1 (x1.5), then set worker.node1.lbfactor=2 and worker.node2.lbfactor=3
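
To quickly evaluate other combinations, the expected share of each worker can be derived from the lbfactor values. This is only a rough sketch: the function name lb_share is hypothetical, it uses integer percentages, and it assumes the ratio described above holds under a steady load.

```shell
# Hypothetical helper: print the approximate percentage of requests each
# worker should receive, given the lbfactor of every worker (in order).
lb_share() {
  local total=0 f out=""
  for f in "$@"; do
    total=$(( total + f ))
  done
  for f in "$@"; do
    # Integer division, so the percentages are rounded down
    out="${out}$(( f * 100 / total )) "
  done
  echo "${out% }"
}
```

For the two cases above, lb_share 1 2 prints "33 66" (node2 gets twice as much as node1) and lb_share 2 3 prints "40 60" (node2 gets 1.5 times as much).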

 

The second thing to do is to modify the Apache Tomcat configuration to add a specific ID. On the Share node1:

[alfresco@share_n1 ~]$ grep "<Engine" $CATALINA_HOME/conf/server.xml
    <Engine name="Catalina" defaultHost="localhost" jvmRoute="share_n1">
[alfresco@share_n1 ~]$

 

On the Share node2:

[alfresco@share_n2 ~]$ grep "<Engine" $CATALINA_HOME/conf/server.xml
    <Engine name="Catalina" defaultHost="localhost" jvmRoute="share_n2">
[alfresco@share_n2 ~]$

 

The value to be put in the jvmRoute parameter is just a string so it can be anything but it must be unique across all Share nodes so that the Apache HTTPD JK module can find the correct back-end node that it should transfer the requests to.

It’s that simple to configure Apache HTTPD as a Load Balancer in front of Alfresco… To check which back-end server you are currently using, you can use the browser’s utilities and in particular the network recording which will display, in the headers/cookies section, the Session ID which will therefore display the value that you put in the jvmRoute.
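
Since the Session ID has the form <session_id>.<tomcat_id>, the back-end node can also be extracted with a one-liner once you have copied the cookie value. This is a minimal sketch: the function name route_of_session and the sample Session ID are made up for the illustration.

```shell
# Hypothetical helper: print the jvmRoute suffix of a Session ID of the
# form <session_id>.<tomcat_id>, or nothing if there is no suffix.
route_of_session() {
  case "$1" in
    *.*) echo "${1##*.}" ;;  # keep only what follows the last dot
    *)   echo "" ;;          # no jvmRoute appended to this Session ID
  esac
}
```

For example, route_of_session "A1B2C3D4E5F6.share_n1" prints share_n1, which means that this session is served by the Share node1.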

 

 


The article Alfresco Clustering – Apache HTTPD as Load Balancer first appeared on Blog dbi services.

Alfresco Clustering – ActiveMQ

Thu, 2019-08-01 01:00

In previous blogs, I talked about some basics and presented some possible architectures for Alfresco, and I talked about the Clustering setup for the Alfresco Repository and Alfresco Share. In this one, I will work on the ActiveMQ layer. I recently posted something related to the setup of ActiveMQ and some initial configuration. I will therefore extend this topic in this blog with what needs to be done to have a simple Cluster for ActiveMQ. I’m not an ActiveMQ expert, I just started using it a few months ago in relation to Alfresco but still, I learned some things in this timeframe so this might be of some use.

ActiveMQ is a Messaging Server, so there are three sides to this component. First, there are Producers which produce messages. These messages are put in the broker’s queue, which is the second side, and finally there are Consumers which consume the messages from the queue. Producers and Consumers are satellites that use the JMS broker’s queue: they are both clients. Therefore, in a standalone architecture (one broker), there is no issue because clients will always produce and consume all messages. However, if you start adding more brokers and if you aren’t doing it right, you might have producers talking to a specific broker and consumers talking to another one. To solve that, there are a few possibilities:

  • a first solution is to create a Network of Brokers which will allow the different brokers to forward the necessary messages between them. You can see that as an Active/Active Cluster
    • Pros: this allows ActiveMQ to support a huge architecture with potentially hundreds or thousands of brokers
    • Cons: messages are, at any point in time, only owned by one single broker so if this broker goes down, the message is lost (if there is no persistence) or will have to wait for the broker to be restarted (if there is persistence)
  • the second solution that ActiveMQ supports is the Master/Slave one. In this architecture, all messages will be replicated from a Master to all Slave brokers. You can see that as something like an Active/Passive Cluster
    • Pros: messages are always processed and cannot be lost. If the Master broker goes down for any reason, one of the Slaves automatically takes its place as the new Master with all the previous messages
    • Cons: since all messages are replicated, it’s much harder to support a huge architecture

In case of a Network of Brokers, it’s possible to use either the static or dynamic discovery of brokers:

  • Static discovery: Uses the static protocol to provide a list of all URIs to be tested to discover other connections. E.g.: static:(tcp://mq_n1.domain:61616,tcp://mq_n2.domain:61616)?maxReconnectDelay=3000
  • Dynamic discovery: Uses a multicast discovery agent to check for other connections. This is done using the discoveryUri parameter in the XML configuration file

 

I. Client’s configuration

On the client’s side, using several brokers is very simple since it’s all about using the correct broker URL. To be able to connect to several brokers, you should use the Failover Transport protocol which replaced the Reliable protocol used in ActiveMQ 3. For Alfresco, this broker URL needs to be updated in the alfresco-global.properties file. This is an example for a pretty simple URL with two brokers:

[alfresco@alf_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### ActiveMQ
messaging.broker.url=failover:(tcp://mq_n1.domain:61616,tcp://mq_n2.domain:61616)?timeout=3000&randomize=false&nested.daemon=false&nested.dynamicManagement=false
#messaging.username=
#messaging.password=
...
[alfresco@alf_n1 ~]$

 

There are a few things to note. The Failover used above is a transport layer that can be combined with any of the other transport methods/protocols. Here it’s used with two TCP URIs. The valid syntaxes are:

  • failover:uri1,…,uriN
    • E.g.: failover:tcp://mq_n1.domain:61616,tcp://mq_n2.domain:61616 => the simplest broker URL for two brokers with no custom options
  • failover:uri1?URIOptions1,…,uriN?URIOptionsN
    • E.g.: failover:tcp://mq_n1.domain:61616?daemon=false&dynamicManagement=false&trace=false,tcp://mq_n2.domain:61616?daemon=false&dynamicManagement=true&trace=true => a more advanced broker URL with some custom options for each of the TCP protocol URIs
  • failover:(uri1?URIOptions1,…,uriN?URIOptionsN)?FailoverTransportOptions
    • E.g.: failover:(tcp://mq_n1.domain:61616?daemon=false&dynamicManagement=false&trace=false,tcp://mq_n2.domain:61616?daemon=false&dynamicManagement=true&trace=true)?timeout=3000&randomize=false => the same broker URL as above but, in addition, with some Failover Transport options
  • failover:(uri1,…,uriN)?FailoverTransportOptions&NestedURIOptions
    • E.g.: failover:(tcp://mq_n1.domain:61616,tcp://mq_n2.domain:61616)?timeout=3000&randomize=false&nested.daemon=false&nested.dynamicManagement=false&nested.trace=false => since ActiveMQ 5.9, it’s now possible to set the nested URIs options (here the TCP protocol options) at the end of the broker URL, they just need to be preceded by “nested.”. Nested options will apply to all URIs.
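If you generate the alfresco-global.properties from a script, the last syntax is easy to assemble programmatically. This is only a small sketch in Python: the build_failover_url helper is a name I made up for the illustration (it isn’t part of ActiveMQ or Alfresco) and it simply concatenates strings following the nomenclature above, without validating the option names.

```python
def build_failover_url(uris, failover_options=None, nested_options=None):
    """Compose a Failover Transport broker URL from its parts.

    Hypothetical helper for illustration: it only joins strings and does
    not check that the options exist in ActiveMQ.
    """
    options = dict(failover_options or {})
    # Options for the nested URIs are appended with the "nested." prefix
    for key, value in (nested_options or {}).items():
        options["nested." + key] = value
    url = "failover:(" + ",".join(uris) + ")"
    if options:
        url += "?" + "&".join(f"{k}={v}" for k, v in options.items())
    return url


if __name__ == "__main__":
    print(build_failover_url(
        ["tcp://mq_n1.domain:61616", "tcp://mq_n2.domain:61616"],
        failover_options={"timeout": 3000, "randomize": "false"},
        nested_options={"daemon": "false", "dynamicManagement": "false"},
    ))
```

With these inputs, the helper rebuilds the same broker URL as the one set in the alfresco-global.properties earlier.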

There are a lot of interesting parameters. These are some of them:

  • Failover Transport options:
    • backup=true: initialize and keep a second connection to another broker for faster failover
    • randomize=true: will pick a new URI for the reconnect randomly from the list of URIs
    • timeout=3000: time in ms before timeout on the send operations
    • priorityBackup=true: clients will failover to other brokers in case the “primary” broker isn’t available (that’s always the case) but it will consistently try to reconnect to the “primary” one. It is possible to specify several “primary” brokers with the priorityURIs option (comma separated list)
  • TCP Transport options:
    • daemon=false: specify that ActiveMQ isn’t running in a Spring or Web container
    • dynamicManagement=false: disabling the JMX management
    • trace=false: disabling the tracing

The full list of Failover Transport options is described here and the full list of TCP Transport options here.

II. Messaging Server’s configuration

I believe the simplest Clustering setup in ActiveMQ is the Master/Slave one, so that’s what I will talk about here. If you are looking for more information about the Network of Brokers, you can find that here. As mentioned previously, the idea behind the Master/Slave is to somehow replicate the messages to the Slave brokers. To do that, there are three possible configurations:

  • Shared File System: use a shared file system
  • JDBC: use a Database Server
  • Replicated LevelDB Store: use a ZooKeeper Server. This has been deprecated in recent versions of ActiveMQ 5 in favour of KahaDB, which is a file-based persistence Database. Therefore, this actually is linked to the first configuration above (Shared File System)

In the scope of Alfresco, you should already have a shared file system as well as a shared Database Server for the Repository Clustering… So, it’s pretty easy to fill the prerequisites for ActiveMQ since you already have them. Of course, you can use a dedicated Shared File System or dedicated Database, that’s up to your requirements.

a. JDBC

For the JDBC configuration, you will need to change the persistenceAdapter to use the dedicated jdbcPersistenceAdapter and create the associated DataSource for your Database. ActiveMQ supports some DBs like Apache Derby, DB2, HSQL, MySQL, Oracle, PostgreSQL, SQLServer or Sybase. You will also need to add the JDBC library at the right location.

[alfresco@mq_n1 ~]$ cat $ACTIVEMQ_HOME/conf/activemq.xml
<beans
  xmlns="http://www.springframework.org/schema/beans"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
  http://activemq.apache.org/schema/core http://activemq.apache.org/schema/core/activemq-core.xsd">
  ...
  <broker xmlns="http://activemq.apache.org/schema/core" brokerName="mq_n1" dataDirectory="${activemq.data}">
    ...
    <persistenceAdapter>
      <jdbcPersistenceAdapter dataDirectory="activemq-data" dataSource="#postgresql-ds"/>
    </persistenceAdapter>
    ...
  </broker>
  ...
  <bean id="postgresql-ds" class="org.postgresql.ds.PGPoolingDataSource">
    <property name="serverName" value="db_vip"/>
    <property name="databaseName" value="alfresco"/>
    <property name="portNumber" value="5432"/>
    <property name="user" value="alfresco"/>
    <property name="password" value="My+P4ssw0rd"/>
    <property name="dataSourceName" value="postgres"/>
    <property name="initialConnections" value="1"/>
    <property name="maxConnections" value="10"/>
  </bean>
  ...
</beans>
[alfresco@mq_n1 ~]$
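A typo between the dataSource reference and the bean id is easy to make, so a small sanity check can help before restarting the broker. The Python sketch below is just an illustration: the unresolved_data_sources function and the trimmed SAMPLE document are my own assumptions, not something shipped with ActiveMQ. It accepts the reference with or without a leading “#” and only compares the bare id with the declared beans.

```python
import xml.etree.ElementTree as ET

SPRING_NS = "http://www.springframework.org/schema/beans"
AMQ_NS = "http://activemq.apache.org/schema/core"


def unresolved_data_sources(xml_text):
    """Return the dataSource references that match no <bean id="...">."""
    root = ET.fromstring(xml_text)
    bean_ids = {bean.get("id") for bean in root.iter("{%s}bean" % SPRING_NS)}
    missing = []
    for adapter in root.iter("{%s}jdbcPersistenceAdapter" % AMQ_NS):
        # The reference may be written with a leading "#": compare the bare id
        ref = (adapter.get("dataSource") or "").lstrip("#")
        if ref not in bean_ids:
            missing.append(ref)
    return missing


# Trimmed-down, hypothetical activemq.xml used only for this example
SAMPLE = """<beans xmlns="http://www.springframework.org/schema/beans">
  <broker xmlns="http://activemq.apache.org/schema/core" brokerName="mq_n1">
    <persistenceAdapter>
      <jdbcPersistenceAdapter dataSource="#postgresql-ds"/>
    </persistenceAdapter>
  </broker>
  <bean id="postgresql-ds" class="org.postgresql.ds.PGPoolingDataSource"/>
</beans>"""

if __name__ == "__main__":
    print(unresolved_data_sources(SAMPLE))
```

An empty list means that every jdbcPersistenceAdapter points to a declared bean. To check a real file, you would read $ACTIVEMQ_HOME/conf/activemq.xml and pass its content to the function.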

 

b. Shared File System

The Shared File System configuration is, from my point of view, the simplest one to configure. However, for it to work properly, you must use a shared file system that supports proper file locking. This means that:

  • you cannot use the Oracle Cluster File System (OCFS/OCFS2) because there is no cluster-aware flock or POSIX locks
  • if you are using NFS v3 or lower, you won’t have automatic failover from Master to Slave because there is no timeout and therefore the lock will never be released. You should therefore use NFS v4 instead
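To make the role of the lock more concrete, here is a small Python sketch of the principle: whoever holds the exclusive lock on the shared file is the Master, and everybody else waits. This is only an illustration under my own assumptions: ActiveMQ takes its lock through Java and not through this code, the try_lock/release helpers are made-up names, and the example uses flock on a local temporary file (Linux/Unix only) rather than on a real shared file system.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return a file descriptor holding an exclusive lock, or None if taken."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Non-blocking: fail immediately if somebody else holds the lock
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def release(fd):
    """Release the lock and close the file descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


if __name__ == "__main__":
    lock_file = os.path.join(tempfile.mkdtemp(), "lock")
    master = try_lock(lock_file)   # first broker: gets the lock
    slave = try_lock(lock_file)    # second broker: the lock is already taken
    print(master is not None, slave is None)
    release(master)                # the Master is stopped
    slave = try_lock(lock_file)    # the waiting broker can now take over
    print(slave is not None)
    release(slave)
```

If the file system never releases the lock of a dead node, the second try_lock would keep failing forever, and that is the failover problem described above for NFS v3.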

Additionally, you need to share the persistenceAdapter between all brokers, but you cannot share the data folder completely. Otherwise, the logs would be overwritten by all brokers (that’s bad but not really an issue) and, more importantly, the PID file would also be overwritten, which would cause issues when starting/stopping Slave brokers…

Therefore, properly configuring the Shared File System is all about keeping the “$ACTIVEMQ_DATA” environment variable set to the place where you want the logs and PID files to be stored (i.e. locally) and overwriting the persistenceAdapter path so that it is on the Shared File System:

[alfresco@mq_n1 ~]$ # Root folder of the ActiveMQ binaries
[alfresco@mq_n1 ~]$ echo $ACTIVEMQ_HOME
/opt/activemq
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Location of the logs and PID file
[alfresco@mq_n1 ~]$ echo $ACTIVEMQ_DATA
/opt/activemq/data
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Location of the Shared File System
[alfresco@mq_n1 ~]$ echo $ACTIVEMQ_SHARED_DATA
/shared/file/system
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo systemctl stop activemq.service
[alfresco@mq_n1 ~]$ grep -A2 "<persistenceAdapter>" $ACTIVEMQ_HOME/conf/activemq.xml
    <persistenceAdapter>
      <kahaDB directory="${activemq.data}/kahadb"/>
    </persistenceAdapter>
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Put the KahaDB into the Shared File System
[alfresco@mq_n1 ~]$ sed -i "s, directory=\"[^\"]*\", directory=\"${ACTIVEMQ_SHARED_DATA}/activemq/kahadb\"," $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ grep -A2 "<persistenceAdapter>" $ACTIVEMQ_HOME/conf/activemq.xml
    <persistenceAdapter>
      <kahaDB directory="/shared/file/system/activemq/kahadb"/>
    </persistenceAdapter>
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo systemctl start activemq.service

 

Starting the Master ActiveMQ will display some information in the node1 log showing that it has started properly and that it is listening for connections on the different transportConnectors:

[alfresco@mq_n1 ~]$ cat $ACTIVEMQ_DATA/activemq.log
2019-07-28 11:34:37,598 | INFO  | Refreshing org.apache.activemq.xbean.XBeanBrokerFactory$1@9f116cc: startup date [Sun Jul 28 11:34:37 CEST 2019]; root of context hierarchy | org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
2019-07-28 11:34:38,289 | INFO  | Using Persistence Adapter: KahaDBPersistenceAdapter[/shared/file/system/activemq/kahadb] | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:34:38,330 | INFO  | KahaDB is version 6 | org.apache.activemq.store.kahadb.MessageDatabase | main
2019-07-28 11:34:38,351 | INFO  | PListStore:[/opt/activemq/data/mq_n1/tmp_storage] started | org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
2019-07-28 11:34:38,479 | INFO  | Apache ActiveMQ 5.15.6 (mq_n1, ID:mq_n1-36925-1564306478360-0:1) is starting | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:34:38,533 | INFO  | Listening for connections at: tcp://mq_n1:61616?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:34:38,542 | INFO  | Connector openwire started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:34:38,545 | INFO  | Listening for connections at: amqp://mq_n1:5672?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:34:38,546 | INFO  | Connector amqp started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:34:38,552 | INFO  | Listening for connections at: stomp://mq_n1:61613?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:34:38,553 | INFO  | Connector stomp started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:34:38,556 | INFO  | Listening for connections at: mqtt://mq_n1:1883?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:34:38,561 | INFO  | Connector mqtt started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:34:38,650 | WARN  | ServletContext@o.e.j.s.ServletContextHandler@11841b15{/,null,STARTING} has uncovered http methods for path: / | org.eclipse.jetty.security.SecurityHandler | main
2019-07-28 11:34:38,710 | INFO  | Listening for connections at ws://mq_n1:61614?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.ws.WSTransportServer | main
2019-07-28 11:34:38,712 | INFO  | Connector ws started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:34:38,712 | INFO  | Apache ActiveMQ 5.15.6 (mq_n1, ID:mq_n1-36925-1564306478360-0:1) started | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:34:38,714 | INFO  | For help or more information please see: http://activemq.apache.org | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:34:39,118 | INFO  | No Spring WebApplicationInitializer types detected on classpath | /admin | main
2019-07-28 11:34:39,373 | INFO  | ActiveMQ WebConsole available at http://0.0.0.0:8161/ | org.apache.activemq.web.WebConsoleStarter | main
2019-07-28 11:34:39,373 | INFO  | ActiveMQ Jolokia REST API available at http://0.0.0.0:8161/api/jolokia/ | org.apache.activemq.web.WebConsoleStarter | main
2019-07-28 11:34:39,402 | INFO  | Initializing Spring FrameworkServlet 'dispatcher' | /admin | main
2019-07-28 11:34:39,532 | INFO  | No Spring WebApplicationInitializer types detected on classpath | /api | main
2019-07-28 11:34:39,563 | INFO  | jolokia-agent: Using policy access restrictor classpath:/jolokia-access.xml | /api | main
[alfresco@mq_n1 ~]$

 

Then, starting a Slave will only show in the node2 log that there is already a Master running. The Slave is therefore just waiting and is not listening for now:

[alfresco@mq_n2 ~]$ cat $ACTIVEMQ_DATA/activemq.log
2019-07-28 11:35:53,258 | INFO  | Refreshing org.apache.activemq.xbean.XBeanBrokerFactory$1@9f116cc: startup date [Sun Jul 28 11:35:53 CEST 2019]; root of context hierarchy | org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
2019-07-28 11:35:53,986 | INFO  | Using Persistence Adapter: KahaDBPersistenceAdapter[/shared/file/system/activemq/kahadb] | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:35:53,999 | INFO  | Database /shared/file/system/activemq/kahadb/lock is locked by another server. This broker is now in slave mode waiting a lock to be acquired | org.apache.activemq.store.SharedFileLocker | main
[alfresco@mq_n2 ~]$
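Since a Slave doesn’t open its transportConnectors while it waits for the lock, a simple way to guess which node is currently the Master is to check which one accepts TCP connections on the OpenWire port. This is a minimal Python sketch and not an official ActiveMQ tool: the function names are mine, and an open port only tells you that something is listening there, not that it is a healthy broker.

```python
import socket


def accepts_connections(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out or unresolvable host: nothing is listening
        return False


def find_master(brokers, timeout=2.0):
    """Return the first (host, port) that accepts connections, else None."""
    for host, port in brokers:
        if accepts_connections(host, port, timeout):
            return (host, port)
    return None
```

With the two nodes of this blog, the call would be find_master([("mq_n1.domain", 61616), ("mq_n2.domain", 61616)]), which should return the node that holds the lock at that moment.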

 

Finally, stopping the Master will automatically turn the Slave into the new Master, without any human interaction. From the node2 logs:

[alfresco@mq_n2 ~]$ cat $ACTIVEMQ_DATA/activemq.log
2019-07-28 11:35:53,258 | INFO  | Refreshing org.apache.activemq.xbean.XBeanBrokerFactory$1@9f116cc: startup date [Sun Jul 28 11:35:53 CEST 2019]; root of context hierarchy | org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
2019-07-28 11:35:53,986 | INFO  | Using Persistence Adapter: KahaDBPersistenceAdapter[/shared/file/system/activemq/kahadb] | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:35:53,999 | INFO  | Database /shared/file/system/activemq/kahadb/lock is locked by another server. This broker is now in slave mode waiting a lock to be acquired | org.apache.activemq.store.SharedFileLocker | main
  # The ActiveMQ Master on node1 has been stopped here (11:37:10)
2019-07-28 11:37:11,166 | INFO  | KahaDB is version 6 | org.apache.activemq.store.kahadb.MessageDatabase | main
2019-07-28 11:37:11,187 | INFO  | PListStore:[/opt/activemq/data/mq_n2/tmp_storage] started | org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
2019-07-28 11:37:11,316 | INFO  | Apache ActiveMQ 5.15.6 (mq_n2, ID:mq_n2-41827-1564306631196-0:1) is starting | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:37:11,370 | INFO  | Listening for connections at: tcp://mq_n2:61616?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:37:11,372 | INFO  | Connector openwire started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:37:11,379 | INFO  | Listening for connections at: amqp://mq_n2:5672?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:37:11,381 | INFO  | Connector amqp started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:37:11,386 | INFO  | Listening for connections at: stomp://mq_n2:61613?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:37:11,387 | INFO  | Connector stomp started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:37:11,390 | INFO  | Listening for connections at: mqtt://mq_n2:1883?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.TransportServerThreadSupport | main
2019-07-28 11:37:11,391 | INFO  | Connector mqtt started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:37:11,485 | WARN  | ServletContext@o.e.j.s.ServletContextHandler@2cfbeac4{/,null,STARTING} has uncovered http methods for path: / | org.eclipse.jetty.security.SecurityHandler | main
2019-07-28 11:37:11,547 | INFO  | Listening for connections at ws://mq_n2:61614?maximumConnections=1000&wireFormat.maxFrameSize=104857600 | org.apache.activemq.transport.ws.WSTransportServer | main
2019-07-28 11:37:11,548 | INFO  | Connector ws started | org.apache.activemq.broker.TransportConnector | main
2019-07-28 11:37:11,556 | INFO  | Apache ActiveMQ 5.15.6 (mq_n2, ID:mq_n2-41827-1564306631196-0:1) started | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:37:11,558 | INFO  | For help or more information please see: http://activemq.apache.org | org.apache.activemq.broker.BrokerService | main
2019-07-28 11:37:11,045 | INFO  | No Spring WebApplicationInitializer types detected on classpath | /admin | main
2019-07-28 11:37:11,448 | INFO  | ActiveMQ WebConsole available at http://0.0.0.0:8161/ | org.apache.activemq.web.WebConsoleStarter | main
2019-07-28 11:37:11,448 | INFO  | ActiveMQ Jolokia REST API available at http://0.0.0.0:8161/api/jolokia/ | org.apache.activemq.web.WebConsoleStarter | main
2019-07-28 11:37:11,478 | INFO  | Initializing Spring FrameworkServlet 'dispatcher' | /admin | main
2019-07-28 11:37:11,627 | INFO  | No Spring WebApplicationInitializer types detected on classpath | /api | main
2019-07-28 11:37:11,664 | INFO  | jolokia-agent: Using policy access restrictor classpath:/jolokia-access.xml | /api | main
[alfresco@mq_n2 ~]$

 

You can of course customize ActiveMQ as per your requirements: remove some connectors, set up SSL, and so on… But that’s not really the purpose of this blog.

 

 


The post Alfresco Clustering – ActiveMQ first appeared on Blog dbi services.

Windows Docker containers, when platform matters

Wed, 2019-07-31 07:58

A couple of days ago, I got a question from a customer about an issue he ran into when trying to spin up a container on Windows.

The context was as follows:

> docker container run hello-world:nanoserver
Unable to find image 'hello-world:nanoserver' locally
nanoserver: Pulling from library/hello-world
C:\Program Files\Docker\docker.exe: no matching manifest for windows/amd64 10.0.14393 in the manifest list entries.
See 'C:\Program Files\Docker\docker.exe run --help'.

 

I thought that was very interesting because it pointed out some considerations about Docker image architecture design. First, we must bear in mind that containers and the underlying host share a single kernel by design, so the container’s base image must be compatible with the host’s kernel.

Let’s first begin with containers in a Linux world because it highlights the concept of Kernel sharing between different distros. In this demo, let’s say I’m running a Linux Ubuntu server 16.04 …

$ cat /etc/os-release | grep -i version
VERSION="16.04.6 LTS (Xenial Xerus)"
VERSION_ID="16.04"
VERSION_CODENAME=xenial

 

… and let’s say I want to run a container based on Centos 6.6 …

$ docker run --rm -ti centos:6.6 cat /etc/centos-release
Unable to find image 'centos:6.6' locally
6.6: Pulling from library/centos
5dd797628260: Pull complete
Digest: sha256:32b80b90ba17ed16e9fa3430a49f53ff6de0d4c76ad8631717a1373d5921fa26
Status: Downloaded newer image for centos:6.6
CentOS release 6.6 (Final)

 

You may wonder how it is possible to run a different distro in the container than on the host, and what the magic behind the scenes is. In fact, both the container and the host share the same Linux kernel. Even though CentOS 6.6 ships with kernel version 2.6 while Ubuntu 16.04 ships with 4.4, this usually works because the Linux kernel is backward compatible with user-space programs. The commands below demonstrate that the CentOS container uses the same kernel as the host.

$ uname -r
4.4.0-142-generic
$ docker run --rm -ti centos:6.6 uname -r
4.4.0-142-generic

 

Let’s say now that my Docker host is running on the x64 architecture. If we look at the architectures supported by the CentOS image on Docker Hub, we notice several of them:

From the output above, we may deduce that there must be a combination of different images and tags for each available architecture. The interesting point is: how does Docker pull the correct one for my underlying architecture? This is where manifest lists come into play and allow multi-architecture images. A manifest list contains platform-specific references, each pointing to a single-platform manifest entry. We may inspect a manifest list through the docker manifest command (still in experimental mode at the time of writing this blog post).

For example, if I want to get the list of manifests and their corresponding architectures for CentOS 7, I can run the docker manifest command as follows:

$ docker manifest inspect centos:7 --verbose
[
        {
                "Ref": "docker.io/library/centos:7@sha256:ca58fe458b8d94bc6e3072f1cfbd334855858e05e1fd633aa07cf7f82b048e66",
                "Descriptor": {
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "digest": "sha256:ca58fe458b8d94bc6e3072f1cfbd334855858e05e1fd633aa07cf7f82b048e66",
                        "size": 529,
                        "platform": {
                                "architecture": "amd64",
                                "os": "linux"
                        }
                },
                "SchemaV2Manifest": {
                        "schemaVersion": 2,
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "config": {
                                "mediaType": "application/vnd.docker.container.image.v1+json",
                                "size": 2182,
                                "digest": "sha256:9f38484d220fa527b1fb19747638497179500a1bed8bf0498eb788229229e6e1"
                        },
                        "layers": [
                                {
                                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                                        "size": 75403831,
                                        "digest": "sha256:8ba884070f611d31cb2c42eddb691319dc9facf5e0ec67672fcfa135181ab3df"
                                }
                        ]
                }
        },
        {
                "Ref": "docker.io/library/centos:7@sha256:9fd67116449f225c6ef60d769b5219cf3daa831c5a0a6389bbdd7c952b7b352d",
                "Descriptor": {
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "digest": "sha256:9fd67116449f225c6ef60d769b5219cf3daa831c5a0a6389bbdd7c952b7b352d",
                        "size": 529,
                        "platform": {
                                "architecture": "arm",
                                "os": "linux",
                                "variant": "v7"
                        }
                },
                "SchemaV2Manifest": {
                        "schemaVersion": 2,
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "config": {
                                "mediaType": "application/vnd.docker.container.image.v1+json",
                                "size": 2181,
                                "digest": "sha256:8c52f2d0416faa8009082cf3ebdea85b3bc1314d97925342be83bc9169178efe"
                        },
                        "layers": [
                                {
                                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                                        "size": 70029389,
                                        "digest": "sha256:193bcbf05ff9ae85ac1a58cacd9c07f8f4297dc648808c347cceb3797ae603af"
                                }
                        ]
                }
        },
        {
                "Ref": "docker.io/library/centos:7@sha256:f25f24daae92b5b5fe75bc0d5d9a3d2145906290f25aa434c43bfcefecd10dec",
                "Descriptor": {
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "digest": "sha256:f25f24daae92b5b5fe75bc0d5d9a3d2145906290f25aa434c43bfcefecd10dec",
                        "size": 529,
                        "platform": {
                                "architecture": "arm64",
                                "os": "linux",
                                "variant": "v8"
                        }
                },
                "SchemaV2Manifest": {
                        "schemaVersion": 2,
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "config": {
                                "mediaType": "application/vnd.docker.container.image.v1+json",
                                "size": 2183,
                                "digest": "sha256:7a51de8a65d533b6706fbd63beea13610e5486e49141610e553a3e784c133a37"
                        },
                        "layers": [
                                {
                                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                                        "size": 74163767,
                                        "digest": "sha256:90c48ff53512085fb5adaf9bff8f1999a39ce5e5b897f5dfe333555eb27547a7"
                                }
                        ]
                }
        },
        {
                "Ref": "docker.io/library/centos:7@sha256:1f832b4e3b9ddf67fd77831cdfb591ce5e968548a01581672e5f6b32ce1212fe",
                "Descriptor": {
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "digest": "sha256:1f832b4e3b9ddf67fd77831cdfb591ce5e968548a01581672e5f6b32ce1212fe",
                        "size": 529,
                        "platform": {
                                "architecture": "386",
                                "os": "linux"
                        }
                },
                "SchemaV2Manifest": {
                        "schemaVersion": 2,
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "config": {
                                "mediaType": "application/vnd.docker.container.image.v1+json",
                                "size": 2337,
                                "digest": "sha256:fe70670fcbec5e3b3081c6800cb531002474c36563689b450d678a34a89b62c3"
                        },
                        "layers": [
                                {
                                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                                        "size": 75654099,
                                        "digest": "sha256:39016a8400a36ce04799adba71f8678ae257d9d8dba638d81b8c5755f01fe213"
                                }
                        ]
                }
        },
        {
                "Ref": "docker.io/library/centos:7@sha256:2d9b27e9c89d511a58873254d86ecf96df0f599daae3d555d896fee9f49fedf4",
                "Descriptor": {
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "digest": "sha256:2d9b27e9c89d511a58873254d86ecf96df0f599daae3d555d896fee9f49fedf4",
                        "size": 529,
                        "platform": {
                                "architecture": "ppc64le",
                                "os": "linux"
                        }
                },
                "SchemaV2Manifest": {
                        "schemaVersion": 2,
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "config": {
                                "mediaType": "application/vnd.docker.container.image.v1+json",
                                "size": 2185,
                                "digest": "sha256:c9744f4afb966c58d227eb6ba03ab9885925f9e3314edd01d0e75481bf1c937d"
                        },
                        "layers": [
                                {
                                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                                        "size": 76787221,
                                        "digest": "sha256:deab1c539926c1ca990d5d025c6b37c649bbba025883d4b209e3b52b8fdf514a"
                                }
                        ]
                }
        }
]
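To give an idea of how a client can pick the right entry in such a list, here is a simplified Python sketch. It is not Docker’s actual selection code (which handles more fields and fallbacks): the select_manifest function is a hypothetical helper, and MANIFEST_LIST is a shortened copy of two entries from the output above.

```python
def select_manifest(entries, os_name, architecture, variant=None):
    """Return the Ref of the first entry matching the requested platform."""
    for entry in entries:
        platform = entry["Descriptor"]["platform"]
        if platform.get("os") != os_name:
            continue
        if platform.get("architecture") != architecture:
            continue
        # The variant (e.g. v7 for arm) is only checked when requested
        if variant is not None and platform.get("variant") != variant:
            continue
        return entry["Ref"]
    return None


# Two entries trimmed from the "docker manifest inspect" output
MANIFEST_LIST = [
    {
        "Ref": "docker.io/library/centos:7@sha256:ca58fe458b8d94bc6e3072f1cfbd334855858e05e1fd633aa07cf7f82b048e66",
        "Descriptor": {"platform": {"architecture": "amd64", "os": "linux"}},
    },
    {
        "Ref": "docker.io/library/centos:7@sha256:9fd67116449f225c6ef60d769b5219cf3daa831c5a0a6389bbdd7c952b7b352d",
        "Descriptor": {"platform": {"architecture": "arm", "os": "linux", "variant": "v7"}},
    },
]

if __name__ == "__main__":
    print(select_manifest(MANIFEST_LIST, "linux", "amd64"))
    print(select_manifest(MANIFEST_LIST, "windows", "amd64"))
```

The second call returns None because the list has no windows entry. That is the situation that ends with a “no matching manifest” error on the Docker side.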

 

Each manifest entry contains different information including the image digest, the operating system and the supported architecture. Let’s pull the centos:7 image:

$ docker pull centos:7
7: Pulling from library/centos
8ba884070f61: Pull complete
Digest: sha256:a799dd8a2ded4a83484bbae769d97655392b3f86533ceb7dd96bbac929809f3c
Status: Downloaded newer image for centos:7
docker.io/library/centos:7

 

Let’s have a look at the unique identifier of the centos:7 image:

$ docker inspect --format='{{.Id}}' centos:7
sha256:9f38484d220fa527b1fb19747638497179500a1bed8bf0498eb788229229e6e1

 

It corresponds to the config digest in the SchemaV2Manifest section of the manifest entry related to the x64 (amd64) architecture (please refer to the docker manifest inspect output above). Another way to query the manifest list and architectures is to go through the mplatform/mquery container as follows:

$ docker run mplatform/mquery centos:7
Image: centos:7
 * Manifest List: Yes
 * Supported platforms:
   - linux/amd64
   - linux/arm/v7
   - linux/arm64
   - linux/386
   - linux/ppc64le

 

However, for the CentOS 6.6 image (used in my first demo), the architecture support seems to be limited to the x64 architecture:

$ docker run mplatform/mquery centos:6.6
Image: centos:6.6
 * Manifest List: Yes
 * Supported platforms:
   - linux/amd64

 

Now that we are aware of manifest lists and multi-architecture images, let’s go back to the initial problem. The customer ran into a platform compatibility issue when trying to spin up the hello-world:nanoserver container on a Windows Server 2016 Docker host. As a reminder, the error message was:

no matching manifest for windows/amd64 10.0.14393 in the manifest list entries.

In a way, that may be surprising because Windows hosts and containers also share a single kernel. That’s true, and it was the root cause of my customer’s issue by the way. The image he wanted to pull supports only the following Windows platforms (queried from the manifest list):

> docker run mplatform/mquery hello-world:nanoserver
Image: hello-world:nanoserver
 * Manifest List: Yes
 * Supported platforms:
   - windows/amd64:10.0.17134.885
   - windows/amd64:10.0.17763.615

 

You may notice several supported Windows platforms, each with a different operating system version. Let’s have a look at the Docker host version in my customer’s context:

> [System.Environment]::OSVersion.Version
Major  Minor  Build  Revision
-----  -----  -----  --------
10     0      14393  0

 

The tricky part is that Windows Server comes in different branches (1607, which is Windows Server 2016, then 1709 and 1803), which technically aren’t the same Windows Server version. Each branch comes with a different build number. According to the Microsoft documentation, a change in the build number (3rd column) means that a new operating system version has been published. In this case, it means that the OS versions of the Windows Docker host and of the Docker image we tried to pull are different, hence the compatibility issue. Note, however, that containers based on an older image may run on a newer host version, but the opposite is obviously not true. You can refer to the same Microsoft link to get a picture of Windows container and host compatibility.

How to fix this issue? Well, we may go two ways here. The first one consists in re-installing a Docker host platform compatible with the corresponding image. The second one consists in using an image compatible with the current host version and, looking at the hello-world image tags, there is one. We may check the compatibility by querying the manifest list as follows:

> docker run mplatform/mquery hello-world:nanoserver-sac2016
Image: hello-world:nanoserver-sac2016
 * Manifest List: Yes
 * Supported platforms:
   - windows/amd64:10.0.14393.2551
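
The comparison between the host version and the platforms listed in the manifest can be sketched in a few lines of Java. This is only an illustration, not Docker’s actual matching code: the class and method names are hypothetical, and it assumes that a host and an image match when their major, minor and build numbers are identical, ignoring the revision (which is consistent with the error message and the outputs shown in this post).

```java
import java.util.Arrays;
import java.util.List;

final class WindowsPlatformMatch {

    /** Returns the first three components (major.minor.build) of a version string. */
    static String buildPrefix(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 3) {
            throw new IllegalArgumentException("Expected at least major.minor.build: " + version);
        }
        return parts[0] + "." + parts[1] + "." + parts[2];
    }

    /** Assumption: host and image match when major.minor.build are equal; the revision is ignored. */
    static boolean matches(String hostVersion, String imageVersion) {
        return buildPrefix(hostVersion).equals(buildPrefix(imageVersion));
    }

    /** True if at least one platform version listed for the image matches the host. */
    static boolean anyMatch(String hostVersion, List<String> imageVersions) {
        for (String imageVersion : imageVersions) {
            if (matches(hostVersion, imageVersion)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String host = "10.0.14393.0"; // value reported by [System.Environment]::OSVersion.Version
        List<String> nanoserver = Arrays.asList("10.0.17134.885", "10.0.17763.615");
        List<String> sac2016 = Arrays.asList("10.0.14393.2551");
        System.out.println("hello-world:nanoserver         -> " + anyMatch(host, nanoserver));
        System.out.println("hello-world:nanoserver-sac2016 -> " + anyMatch(host, sac2016));
    }
}
```

The version strings are the ones returned by mquery and by the PowerShell command above; only the matching rule itself is an assumption.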

 

Let’s try to pull the image with the nanoserver-sac2016 tag:

> docker pull hello-world:nanoserver-sac2016
nanoserver-sac2016: Pulling from library/hello-world
bce2fbc256ea: Already exists
6f2071dcd729: Pull complete
909cdbafc9e1: Pull complete
a43e426cc5c9: Pull complete
Digest: sha256:878fd913010d26613319ec7cc83b400cb92113c314da324681d9fecfb5082edc
Status: Downloaded newer image for hello-world:nanoserver-sac2016
docker.io/library/hello-world:nanoserver-sac2016

 

Here we go!

See you!

The post Windows Docker containers, when platform matters appeared first on Blog dbi services.

Alfresco Clustering – Share

Wed, 2019-07-31 01:00

In previous blogs, I talked about some basics and presented some possible architectures for Alfresco, and I talked about the Clustering setup for the Alfresco Repository. In this one, I will work on the Alfresco Share layer. Therefore, if you are using another client like a CMIS/REST client or an ADF Application, this blog won’t apply as such. You might or might not need Clustering at that layer; it depends on how the Application works.

The Alfresco Share Clustering is used only for the caches, so you could technically have multiple Share nodes working with a single Repository or a Repository Cluster without the Share Clustering. In the past, you had to disable the caches on the Share layer for that, because if you kept them enabled, you would eventually face issues. Alfresco then introduced the Share Clustering, which keeps the caches in sync, so you don’t have to disable them anymore. When needed, cache invalidation messages are sent from one Share node to all others; this includes runtime application property changes as well as changes to new/existing site/user dashboards.

Just like for the Repository part, it’s really easy to set up the Share Clustering, so there is really no reason not to. It also uses Hazelcast, but it’s not based on properties that you configure in the alfresco-global.properties (because it’s a Share configuration). This one must be done in an XML file, and there is obviously no possibility to do it in the Alfresco Admin Console.

All Share configurations/customizations are put in the “$CATALINA_HOME/shared/classes/alfresco/web-extension” folder, and this one is no exception. There are two possibilities for the Share Clustering communications:

  • Multicast
  • Unicast (TCP-IP in Hazelcast)

 

I. Multicast

If you do not know how many nodes will participate in your Share Cluster or if you want to be able to add more nodes in the future without having to change the previous nodes’ configuration, then you probably want to check and opt for the Multicast option. Just create a new file “$CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml” and put this content inside it:

[alfresco@share_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hz="http://www.hazelcast.com/schema/spring"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
                           http://www.hazelcast.com/schema/spring
                           http://www.hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd">

  <hz:topic id="topic" instance-ref="webframework.cluster.slingshot" name="share_hz_test"/>
  <hz:hazelcast id="webframework.cluster.slingshot">
    <hz:config>
      <hz:group name="slingshot" password="Sh4r3_hz_Test_pwd"/>
      <hz:network port="5801" port-auto-increment="false">
        <hz:join>
          <hz:multicast enabled="true" multicast-group="224.2.2.5" multicast-port="54327"/>
          <hz:tcp-ip enabled="false">
            <hz:members></hz:members>
          </hz:tcp-ip>
        </hz:join>
        <hz:interfaces enabled="false">
          <hz:interface></hz:interface>
        </hz:interfaces>
      </hz:network>
    </hz:config>
  </hz:hazelcast>

  <bean id="webframework.cluster.clusterservice" class="org.alfresco.web.site.ClusterTopicService" init-method="init">
    <property name="hazelcastInstance" ref="webframework.cluster.slingshot" />
    <property name="hazelcastTopicName">
      <value>share_hz_test</value>
    </property>
  </bean>

</beans>
[alfresco@share_n1 ~]$

 

In the above configuration, be sure to set a topic name (matching the hazelcastTopicName’s value) as well as a group password that are specific to this environment, so you don’t end up with a single Cluster whose members come from different environments. For the Share layer, it’s less of an issue than for the Repository layer, but still. Also be sure to use a network port that isn’t in use: it is the port that Hazelcast will bind to on the local host. For the Alfresco Repository Clustering, we used 5701, so here it’s 5801, for example.

Not much more to say about this configuration, we just enabled the multicast with an IP and a port to be used and we disabled the tcp-ip one.

The interfaces section is disabled by default, but you can enable it if you want to. If it’s disabled, Hazelcast will list all local interfaces (127.0.0.1, local_IP1, local_IP2, …) and choose one from this list. If you want to force Hazelcast to use a specific local network interface, then enable this section and add it there. It can use the following nomenclature (IP only!):

  • 10.10.10.10: Hazelcast will try to bind on 10.10.10.10 only. If it’s not available, then it won’t start
  • 10.10.10.10-11: Hazelcast will try to bind on any IP within the range 10-11 so in this case 2 IPs: 10.10.10.10 or 10.10.10.11. If you have, let’s say, 5 IPs assigned to the local host and you don’t want Hazelcast to use 3 of these, then specify the ones that it can use and it will pick one from the list. This can also be used to have the same content for the custom-slingshot-application-context.xml on different hosts… One server with IP 10.10.10.10 and a second one with IP 10.10.10.11
  • 10.10.10.* or 10.10.*.*: Hazelcast will try to bind on any IP in this range, this is an extended version of the XX-YY range above
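
To make the three notations more concrete, here is a minimal Java sketch of how such a pattern could be matched against a local IPv4 address. It is not Hazelcast’s actual implementation: the class and method names are hypothetical, and the code assumes well-formed IPv4 addresses and patterns where each octet is a number, a range (XX-YY) or a wildcard (*).

```java
final class InterfacePattern {

    /** Returns true if the IPv4 address matches a pattern such as 10.10.10.10, 10.10.10.10-11 or 10.10.*.*. */
    static boolean matches(String pattern, String ip) {
        String[] patternOctets = pattern.trim().split("\\.");
        String[] ipOctets = ip.trim().split("\\.");
        if (patternOctets.length != 4 || ipOctets.length != 4) {
            return false;
        }
        for (int i = 0; i < 4; i++) {
            if (!octetMatches(patternOctets[i], Integer.parseInt(ipOctets[i]))) {
                return false;
            }
        }
        return true;
    }

    /** Matches one octet against "*", a range "low-high" or a plain number. */
    private static boolean octetMatches(String patternOctet, int value) {
        if (patternOctet.equals("*")) {
            return true;
        }
        int dash = patternOctet.indexOf('-');
        if (dash > 0) {
            int low = Integer.parseInt(patternOctet.substring(0, dash));
            int high = Integer.parseInt(patternOctet.substring(dash + 1));
            return value >= low && value <= high;
        }
        return Integer.parseInt(patternOctet) == value;
    }

    public static void main(String[] args) {
        String[] patterns = {"10.10.10.10", "10.10.10.10-11", "10.10.10.*"};
        for (String pattern : patterns) {
            System.out.println(pattern + " accepts 10.10.10.11: " + matches(pattern, "10.10.10.11"));
        }
    }
}
```

With the range notation, the same file content can be deployed on both hosts: each node simply picks whichever of its local addresses falls within the range.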

 

In most cases, keeping the interfaces disabled is sufficient since Hazelcast will just pick an available one. You might think that Hazelcast could bind itself to 127.0.0.1. Technically it’s possible, since it’s a local network interface, but I have never seen it do so, so I assume that there is some kind of preferred order if another IP is available.

Membership in Hazelcast is based on “age”, meaning that the oldest member will be the one to lead. There are no predefined Master or Slave members, they are all equal, but the oldest/first member is the one that checks whether new members are allowed to join (correct config) and, if so, sends the information to all other members that already joined so they are all aligned. If multicast is enabled, a multicast listener is started to listen for new membership requests.
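
The “oldest member leads” idea can be illustrated with a tiny Java model. This is a conceptual sketch only, not Hazelcast code: the class is hypothetical, it keeps members in join order, and it assumes that when the oldest member leaves, the next oldest one takes over.

```java
import java.util.ArrayList;
import java.util.List;

final class MembershipSketch {

    // Members are kept in join order, so index 0 is always the oldest one.
    private final List<String> members = new ArrayList<>();

    /** Adds a member at the end of the list, unless it is already known. */
    void join(String address) {
        if (!members.contains(address)) {
            members.add(address);
        }
    }

    /** Removes a member; the remaining ones keep their relative age. */
    void leave(String address) {
        members.remove(address);
    }

    /** Returns the oldest member (the one that leads), or null if the cluster is empty. */
    String oldest() {
        return members.isEmpty() ? null : members.get(0);
    }

    public static void main(String[] args) {
        MembershipSketch cluster = new MembershipSketch();
        cluster.join("share_n1.domain:5801");
        cluster.join("share_n2.domain:5801");
        System.out.println("Leader: " + cluster.oldest());
        cluster.leave("share_n1.domain:5801");
        System.out.println("Leader after node1 left: " + cluster.oldest());
    }
}
```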

 

II. Unicast

If you already know how many nodes will participate in your Share Cluster or if you prefer to avoid Multicast messages (there is no real need to overload your network with such things…), then it’s preferable to use Unicast messaging. For that purpose, just create the same file as above (“$CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml”) but this time, use the tcp-ip section:

[alfresco@share_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hz="http://www.hazelcast.com/schema/spring"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
                           http://www.hazelcast.com/schema/spring
                           http://www.hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd">

  <hz:topic id="topic" instance-ref="webframework.cluster.slingshot" name="share_hz_test"/>
  <hz:hazelcast id="webframework.cluster.slingshot">
    <hz:config>
      <hz:group name="slingshot" password="Sh4r3_hz_Test_pwd"/>
      <hz:network port="5801" port-auto-increment="false">
        <hz:join>
          <hz:multicast enabled="false" multicast-group="224.2.2.5" multicast-port="54327"/>
          <hz:tcp-ip enabled="true">
            <hz:members>share_n1.domain,share_n2.domain</hz:members>
          </hz:tcp-ip>
        </hz:join>
        <hz:interfaces enabled="false">
          <hz:interface></hz:interface>
        </hz:interfaces>
      </hz:network>
    </hz:config>
  </hz:hazelcast>

  <bean id="webframework.cluster.clusterservice" class="org.alfresco.web.site.ClusterTopicService" init-method="init">
    <property name="hazelcastInstance" ref="webframework.cluster.slingshot" />
    <property name="hazelcastTopicName">
      <value>share_hz_test</value>
    </property>
  </bean>

</beans>
[alfresco@share_n1 ~]$

 

The description is basically the same as for the Multicast part. The main difference is that multicast is disabled and tcp-ip is enabled, so a list of members needs to be set. This is a comma-separated list of hostnames or IPs that Hazelcast will try to contact when it starts. Membership in case of Unicast is managed in the same way, except that the oldest/first member will listen for new membership requests over TCP-IP. Therefore, it’s the same principle, it’s just done differently.

Starting the first Share node in the Cluster will display the following information in the logs:

Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Resolving domain name 'share_n1.domain' to address(es): [127.0.0.1, 10.10.10.10]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Resolving domain name 'share_n2.domain' to address(es): [10.10.10.11]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [share_n1.domain/10.10.10.10, share_n2.domain/10.10.10.11, share_n1.domain/127.0.0.1]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Prefer IPv4 stack is true.
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Picked Address[share_n1.domain]:5801, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5801], bind any local is true
Jul 28, 2019 11:45:36 AM com.hazelcast.system
INFO: [share_n1.domain]:5801 [slingshot] Hazelcast Community Edition 2.4 (20121017) starting at Address[share_n1.domain]:5801
Jul 28, 2019 11:45:36 AM com.hazelcast.system
INFO: [share_n1.domain]:5801 [slingshot] Copyright (C) 2008-2012 Hazelcast.com
Jul 28, 2019 11:45:36 AM com.hazelcast.impl.LifecycleServiceImpl
INFO: [share_n1.domain]:5801 [slingshot] Address[share_n1.domain]:5801 is STARTING
Jul 28, 2019 11:45:36 AM com.hazelcast.impl.TcpIpJoiner
INFO: [share_n1.domain]:5801 [slingshot] Connecting to possible member: Address[share_n2.domain]:5801
Jul 28, 2019 11:45:36 AM com.hazelcast.nio.SocketConnector
INFO: [share_n1.domain]:5801 [slingshot] Could not connect to: share_n2.domain/10.10.10.11:5801. Reason: ConnectException[Connection refused]
Jul 28, 2019 11:45:37 AM com.hazelcast.nio.SocketConnector
INFO: [share_n1.domain]:5801 [slingshot] Could not connect to: share_n2.domain/10.10.10.11:5801. Reason: ConnectException[Connection refused]
Jul 28, 2019 11:45:37 AM com.hazelcast.impl.TcpIpJoiner
INFO: [share_n1.domain]:5801 [slingshot]

Members [1] {
        Member [share_n1.domain]:5801 this
}

Jul 28, 2019 11:45:37 AM com.hazelcast.impl.LifecycleServiceImpl
INFO: [share_n1.domain]:5801 [slingshot] Address[share_n1.domain]:5801 is STARTED
2019-07-28 11:45:37,164  INFO  [web.site.ClusterTopicService] [localhost-startStop-1] Init complete for Hazelcast cluster - listening on topic: share_hz_test

 

Then, starting a second node of the Share Cluster will display the following (still in the node1 logs):

Jul 28, 2019 11:48:31 AM com.hazelcast.nio.SocketAcceptor
INFO: [share_n1.domain]:5801 [slingshot] 5801 is accepting socket connection from /10.10.10.11:34191
Jul 28, 2019 11:48:31 AM com.hazelcast.nio.ConnectionManager
INFO: [share_n1.domain]:5801 [slingshot] 5801 accepted socket connection from /10.10.10.11:34191
Jul 28, 2019 11:48:38 AM com.hazelcast.cluster.ClusterManager
INFO: [share_n1.domain]:5801 [slingshot]

Members [2] {
        Member [share_n1.domain]:5801 this
        Member [share_n2.domain]:5801
}
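
If a node never shows up in the member list, a quick connectivity test of the configured members can help. The following Java sketch is only a troubleshooting aid and has nothing to do with the way Hazelcast joins members internally: the class and method names are hypothetical, and it assumes the comma-separated member list and the port 5801 used in the configuration above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

final class MemberCheck {

    /** Splits a comma-separated list of hostnames/IPs, ignoring blanks. */
    static List<String> parseMembers(String csv) {
        List<String> result = new ArrayList<>();
        for (String item : csv.split(",")) {
            String host = item.trim();
            if (!host.isEmpty()) {
                result.add(host);
            }
        }
        return result;
    }

    /** Returns true if a TCP connection to host:port can be opened within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException | RuntimeException e) {
            // Connection refused, timeout, unknown host, ...
            return false;
        }
    }

    public static void main(String[] args) {
        for (String host : parseMembers("share_n1.domain,share_n2.domain")) {
            System.out.println(host + ":5801 reachable = " + canConnect(host, 5801, 2000));
        }
    }
}
```

A “Connection refused” on a member that is not started yet is expected, as shown in the node1 logs above; the check is mostly useful to spot firewall or name resolution problems.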

 

 


The post Alfresco Clustering – Share appeared first on Blog dbi services.

Java JDK12: JEP 325: Switch Expressions

Tue, 2019-07-30 02:25
Eclipse setup

In order to test Java JDK12, you will need at least Eclipse 4.11. Then download JDK12 from Oracle’s web site and configure Eclipse to use this JDK for the project. Also install the JDK12 support from the following repository link:

https://download.eclipse.org/eclipse/updates/4.11-P-builds

Edit the project’s properties to set the “Java Compiler” compliance level to 12 and set “Enable preview features” to TRUE. Now you should be able to run JDK12 examples. Here is a link for the complete setup:

https://marketplace.eclipse.org/content/java-12-support-eclipse-2019-03-411

Switch expression

The switch statement gets a facelift in this version. It brings more readability and more flexibility. We can now:

  • Get rid of the break word in certain cases
  • Return a value from the switch
  • Put several values in one “case”

With the following example you can see how to get rid of the “break” and use a “case L ->” switch label. In addition, the switch returns a value which can be used after the expression.

String alias = switch (day) {
    case MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY -> "Week day";
    case SATURDAY, SUNDAY                             -> "Weekend";
};
System.out.println(alias);

The previous few lines replace the following ones:

switch (day) {
    case MONDAY:
        System.out.println("Week day");
        break;
    case TUESDAY:
        System.out.println("Week day");
        break;
    case WEDNESDAY:
        System.out.println("Week day");
        break;
    case THURSDAY:
        System.out.println("Week day");
        break;
    case FRIDAY:
        System.out.println("Week day");
        break;
    case SATURDAY:
        System.out.println("Weekend");
        break;
    case SUNDAY:
        System.out.println("Weekend");
        break;
}

Here is the general form of the expression:

T result = switch (arg) {
    case L1 -> e1;
    case L2 -> e2;
    default -> e3;
};

When using a block of code you will still need a “break”:

int i = switch (day) {
    case MONDAY -> {
        System.out.println("Monday");
        break 0; // in a block, "break <value>" is needed to return the value of the switch expression
    }
    default -> 1;
};

Thanks to this new notation, we can now return values with a switch expression. In addition, it makes big switch expressions more readable. Note that you can still use the “old” way, so old code will still compile with the new JDK. For the moment, it is delivered as a preview feature in JDK12, which is a non-LTS release, but if the new expression is validated by tests and feedback, it will most likely be part of future releases such as the LTS versions, and you will be able to use it there.
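
For readers on a more recent JDK, here is a self-contained sketch of the same ideas. It assumes JDK 14 or later, where switch expressions became a standard feature and a block returns its value with “yield” instead of “break”. The class name is hypothetical and java.time.DayOfWeek stands in for the “day” variable used above.

```java
import java.time.DayOfWeek;

final class SwitchExpressionDemo {

    /** Several values per case, no break, and the switch returns a value. */
    static String alias(DayOfWeek day) {
        return switch (day) {
            case MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY -> "Week day";
            case SATURDAY, SUNDAY -> "Weekend";
        };
    }

    /** Block form: the value of the block is returned with yield (JDK 14+). */
    static int code(DayOfWeek day) {
        return switch (day) {
            case MONDAY -> {
                System.out.println("Monday");
                yield 0;
            }
            default -> 1;
        };
    }

    public static void main(String[] args) {
        System.out.println(alias(DayOfWeek.SATURDAY));
        System.out.println(code(DayOfWeek.MONDAY));
    }
}
```

Since all enum constants are covered in alias(), no default branch is required there.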

The post Java JDK12: JEP 325: Switch Expressions appeared first on Blog dbi services.

Alfresco Clustering – Repository

Tue, 2019-07-30 01:00

In a previous blog, I talked about some basics and presented some possible architectures for Alfresco. Now that this introduction has been done, let’s dig into the real blogs about how to set up an HA/Clustering Alfresco environment. In this blog in particular, I will talk about the Repository layer.

For the Repository Clustering, there are three prerequisites (and that’s all you need):

  • A valid license which includes the Repository Clustering
  • A shared file system which is accessible from all Alfresco nodes in the Cluster. This is usually a NAS accessed via NFS
  • A shared database

 

Clustering the Repository part is really simple to do: you just need to put the correct properties in the alfresco-global.properties file. Of course, you could also manage it all from the Alfresco Admin Console, but that’s not recommended: you should really always use the alfresco-global.properties by default. The Alfresco Repository Clustering uses Hazelcast. It was using JGroups and EHCache as well before Alfresco 4.2, but now it’s just Hazelcast. So to define an Alfresco Cluster, simply put the following configuration in the alfresco-global.properties of the Alfresco node1:

[alfresco@alf_n1 ~]$ getent hosts `hostname` | awk '{ print $1 }'
10.10.10.10
[alfresco@alf_n1 ~]$
[alfresco@alf_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### Content Store
dir.root=/shared_storage/alf_data
...
### DB
db.username=alfresco
db.password=My+P4ssw0rd
db.name=alfresco
db.host=db_vip
## MySQL
#db.port=3306
#db.driver=com.mysql.jdbc.Driver
#db.url=jdbc:mysql://${db.host}:${db.port}/${db.name}?useUnicode=yes&characterEncoding=UTF-8
#db.pool.validate.query=SELECT 1
## PostgreSQL
db.driver=org.postgresql.Driver
db.port=5432
db.url=jdbc:postgresql://${db.host}:${db.port}/${db.name}
db.pool.validate.query=SELECT 1
## Oracle
#db.driver=oracle.jdbc.OracleDriver
#db.port=1521
#db.url=jdbc:oracle:thin:@${db.host}:${db.port}:${db.name}
#db.pool.validate.query=SELECT 1 FROM DUAL
...
### Clustering
alfresco.cluster.enabled=true
alfresco.cluster.interface=10.10.10.10-11
alfresco.cluster.nodetype=Alfresco_node1
alfresco.hazelcast.password=Alfr3sc0_hz_Test_pwd
alfresco.hazelcast.port=5701
alfresco.hazelcast.autoinc.port=false
alfresco.hazelcast.max.no.heartbeat.seconds=15
...
[alfresco@alf_n1 ~]$

 

And for the Alfresco node2, you can use the same content:

[alfresco@alf_n2 ~]$ getent hosts `hostname` | awk '{ print $1 }'
10.10.10.11
[alfresco@alf_n2 ~]$
[alfresco@alf_n2 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### Content Store
dir.root=/shared_storage/alf_data
...
### DB
db.username=alfresco
db.password=My+P4ssw0rd
db.name=alfresco
db.host=db_vip
## MySQL
#db.port=3306
#db.driver=com.mysql.jdbc.Driver
#db.url=jdbc:mysql://${db.host}:${db.port}/${db.name}?useUnicode=yes&characterEncoding=UTF-8
#db.pool.validate.query=SELECT 1
## PostgreSQL
db.driver=org.postgresql.Driver
db.port=5432
db.url=jdbc:postgresql://${db.host}:${db.port}/${db.name}
db.pool.validate.query=SELECT 1
## Oracle
#db.driver=oracle.jdbc.OracleDriver
#db.port=1521
#db.url=jdbc:oracle:thin:@${db.host}:${db.port}:${db.name}
#db.pool.validate.query=SELECT 1 FROM DUAL
...
### Clustering
alfresco.cluster.enabled=true
alfresco.cluster.interface=10.10.10.10-11
alfresco.cluster.nodetype=Alfresco_node2
alfresco.hazelcast.password=Alfr3sc0_hz_Test_pwd
alfresco.hazelcast.port=5701
alfresco.hazelcast.autoinc.port=false
alfresco.hazelcast.max.no.heartbeat.seconds=15
...
[alfresco@alf_n2 ~]$

 

Description of the Clustering parameters:

  • alfresco.cluster.enabled: Whether or not you want to enable the Repository Clustering for the local Alfresco node. The default value is false. You will want to set that to true for all Repository nodes that will be used by Share or any other client. If the Repository is only used for Solr Tracking, you can leave that to false
  • alfresco.cluster.interface: This is the network interface on which Hazelcast will listen for Clustering messages. This has to be an IP, it can’t be a hostname. To keep things simple and to have the same alfresco-global.properties on all Alfresco nodes however, it is possible to use a specific nomenclature:
    • 10.10.10.10: Hazelcast will try to bind on 10.10.10.10 only. If it’s not available, then it won’t start
    • 10.10.10.10-11: Hazelcast will try to bind on any IP within the range 10-11 so in this case 2 IPs: 10.10.10.10 or 10.10.10.11. If you have, let’s say, 4 IPs assigned to the local host and you don’t want Hazelcast to use 2 of these, then specify the ones that it can use and it will pick one from the list. This can also be used to have the same content for the alfresco-global.properties on different hosts… One server with IP 10.10.10.10 and a second one with IP 10.10.10.11
    • 10.10.10.* or 10.10.*.*: Hazelcast will try to bind on any IP in this range, this is an extended version of the XX-YY range above
  • alfresco.cluster.nodetype: A human-friendly string to represent the local Alfresco node. It doesn’t have any use for Alfresco, that’s really more for you. It is for example interesting to put a specific string for Alfresco nodes that won’t take part in the Clustering but that are still using the same Content Store and Database (like a Repository dedicated to the Solr Tracking, as mentioned above)
  • alfresco.hazelcast.password: The password to use for the Alfresco Repository Cluster. You need to use the same password for all members of the same Cluster. You should as well try to use a different password for each Cluster that you might have if they are in the same network (DEV/TEST/PROD for example), otherwise it will get ugly
  • alfresco.hazelcast.port: The default port that will be used for Clustering messages between the different members of the Cluster
  • alfresco.hazelcast.autoinc.port: Whether or not you want to allow Hazelcast to find another free port in case the default port (“alfresco.hazelcast.port”) is currently used. It will increment the port by 1 each time. You should really set this to false and just use the default port to have full control over the channels that Clustering communications are using, otherwise it might get messy as well
  • alfresco.hazelcast.max.no.heartbeat.seconds: The maximum time in seconds allowed between two heartbeats. If there is no heartbeat in this period of time, Alfresco will assume the remote node isn’t running/available
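
Before starting a node, the recommendations above can be turned into a small sanity check. The following Java sketch is a hypothetical helper (it is not an Alfresco tool): it reads the properties with java.util.Properties and only verifies the points described in this list, namely that clustering is enabled, that the interface is an IP or IP pattern rather than a hostname, that a password is set, and that the port auto-increment is not switched on.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class ClusterConfigCheck {

    /** Parses properties from text in the alfresco-global.properties format. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns a list of findings; an empty list means that nothing suspicious was found. */
    static List<String> check(Properties props) {
        List<String> problems = new ArrayList<>();
        // The default value of alfresco.cluster.enabled is false
        if (!"true".equalsIgnoreCase(props.getProperty("alfresco.cluster.enabled", "false").trim())) {
            problems.add("alfresco.cluster.enabled is not true");
        }
        // Only digits, dots, '-' ranges and '*' wildcards are accepted: no hostname
        String iface = props.getProperty("alfresco.cluster.interface", "").trim();
        if (!iface.matches("[0-9*.\\-]+")) {
            problems.add("alfresco.cluster.interface must be an IP or an IP pattern");
        }
        if (props.getProperty("alfresco.hazelcast.password", "").trim().isEmpty()) {
            problems.add("alfresco.hazelcast.password is empty");
        }
        // Only flagged when explicitly set to something other than false
        String autoinc = props.getProperty("alfresco.hazelcast.autoinc.port");
        if (autoinc != null && !"false".equalsIgnoreCase(autoinc.trim())) {
            problems.add("alfresco.hazelcast.autoinc.port should be false");
        }
        return problems;
    }

    public static void main(String[] args) {
        Properties props = parse(String.join("\n",
                "alfresco.cluster.enabled=true",
                "alfresco.cluster.interface=10.10.10.10-11",
                "alfresco.hazelcast.password=Alfr3sc0_hz_Test_pwd",
                "alfresco.hazelcast.autoinc.port=false"));
        System.out.println("Findings: " + check(props));
    }
}
```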

 

As you can see above, it’s really simple to add Clustering to an Alfresco Repository. Since you can (should?) have the same set of properties on all nodes (except the nodetype string maybe), it also really simplifies the deployment… If you are familiar with other Document Management Systems like Documentum for example, then you understand the complexity of some of these solutions! If you compare that to Alfresco, it’s like walking on the street versus walking on the moon, where you obviously first need to go to the moon… Anyway, once it’s done, the logs of the Alfresco Repository node1 will display something like this when you start it:

2019-07-20 15:14:25,401  INFO  [cluster.core.ClusteringBootstrap] [localhost-startStop-1] Cluster started, name: MainRepository-<generated_id>
2019-07-20 15:14:25,405  INFO  [cluster.core.ClusteringBootstrap] [localhost-startStop-1] Current cluster members:
  10.10.10.10:5701 (hostname: alf_n1)

 

Wait for the Repository node1 to be fully started and, once done, start the Repository node2: the nodes should normally be started sequentially. You will see in the logs of the Repository node1 that another node automatically joined the Cluster:

2019-07-20 15:15:06,528  INFO  [cluster.core.MembershipChangeLogger] [hz._hzInstance_1_MainRepository-<generated_id>.event-3] Member joined: 10.10.10.11:5701 (hostname: alf_n2)
2019-07-20 15:15:06,529  INFO  [cluster.core.MembershipChangeLogger] [hz._hzInstance_1_MainRepository-<generated_id>.event-3] Current cluster members:
  10.10.10.10:5701 (hostname: alf_n1)
  10.10.10.11:5701 (hostname: alf_n2)

 

In the logs of the Repository node2, you can see directly at the initialization of the Hazelcast Cluster that the two nodes are available.

If you don’t want to check the logs, you can see pretty much the same thing from the Alfresco Admin Console. By accessing “http(s)://<hostname>:<port>/alfresco/s/enterprise/admin/admin-clustering”, you can see currently available cluster members (online nodes), non-available cluster members (offline nodes) as well as connected non-cluster members (nodes using the same DB & Content Store but with “alfresco.cluster.enabled=false”, for example to dedicate a Repository to Solr Tracking).

Alfresco also provides a small utility to check the health of the cluster, which basically ensures that the communication between each member is successful. This utility can be accessed at “http(s)://<hostname>:<port>/alfresco/s/enterprise/admin/admin-clustering-test”. It is useful to include a quick check using this utility in a monitoring solution for example, to ensure that the cluster is healthy.

 

The post Alfresco Clustering – Repository appeared first on Blog dbi services.

CloudBees DevOps Playground – Hands On with JenkinsX

Mon, 2019-07-29 16:32

Last week, we had the chance to attend the CloudBees DevOps Playground in London. The event was a presentation and a hands-on session on Jenkins X given by one of the most popular people at CloudBees, Gareth Evans.

Before taking an interest in Jenkins X, we focused most of our time on Docker and Kubernetes. Over the last months, we greatly improved our skills in the administration of Kubernetes clusters and the deployment of applications, especially the Documentum stack as well as WebLogic.

Jenkins X is quite a new technology in the landscape of automated deployment, and we had difficulties finding a workshop or training related to Jenkins X administration and usage. So we decided to go to London for this CloudBees DevOps hands-on session.

As we work on middleware infrastructures, between system engineers and applications, Jenkins X makes complete sense for us to automate the creation of Kubernetes infrastructure, in terms of both clusters and application deployment.

What’s Jenkins X?

Basically, Jenkins X automates the whole development process end to end for containerized applications based on Docker and Kubernetes.

Overview of Jenkins X:

  • Jenkins X provides an automated CI/CD solution for Kubernetes
  • Buildpacks to quickly create new applications
  • Uses GitOps to manage promotion between Environments
  • Creates Preview Environments on Pull Requests
  • Provides control via ChatOps and feedback on Pull Requests
  • Improves developers’ productivity
  • It is open source
  • Microservices architecture
  • Designed for extension
  • Relies on k8s CRDs

JX Topologies:

Jenkins X can work in 2 modes: Static and Serverless.

Cloud-Native approach:
Our goal will be to use Jenkins X to automate the deployment of containerized applications on a Kubernetes cluster.
Jenkins X enables real collaboration between system engineers and application teams, with a focus on making development teams productive through automation and DevOps best practices.

We will automate CI/CD pipelines using Jenkins X as follows:

This is how Jenkins X works (big picture). We will see later how to install JX with the different methods, in the cloud or on-premise, and how to build CI/CD pipelines.

The post CloudBees DevOps Playground – Hands On with JenkinsX appeared first on Blog dbi services.

Alfresco Clustering – Basis & Architectures

Mon, 2019-07-29 01:00

This blog will be the first of a series on Alfresco HA/Clustering topics. It’s been too long since I last posted anything related to Alfresco, so I thought about writing a few blogs about my experience with setting up more or less complex HA/Clustering infrastructures. So, let’s start this first part with an introduction to the Alfresco HA/Clustering.

If you want to set up an HA/Cluster environment, you will first have to think about where you want to go exactly. Alfresco is composed of several components, so “what do you want to achieve exactly?” would probably be the first question to ask.

Alfresco offers a lot of possibilities, you can more or less do whatever you want. That’s really great, but it also means that you should plan what you want to do first. Do you just want a simple HA architecture for Share+Repository but you can live without Solr for a few minutes/hours (in case of issues) or you absolutely want all components to be always available? Or maybe you want an HA architecture which is better suited for high throughput? Obviously, there might be some costs details that need to be taken into consideration linked to the resources but also the licenses: the Alfresco Clustering license itself but also the Index Engine license if you go for separated Solr Servers.

That’s what you need to define first to avoid losing time changing configurations and adding more components into the picture later. Alternatively (and that’s something I will try to cover as much as I can), it’s also possible to setup an environment which will allow you to add more components (at least some of them…) as needed without having to change your HA/Clustering configuration, if you are doing it right from the start and if you don’t change too much the architecture itself.

I mentioned earlier the components of Alfresco (Alfresco Content Services, not the company), these are the ones we are usually talking about:

  • *Front-end (Apache HTTPD, Nginx, …)
  • *ActiveMQ
  • Alfresco PDF Renderer
  • Database
  • File System
  • ImageMagick
  • Java
  • LibreOffice
  • *Share (Tomcat)
  • *Repository (Tomcat)
  • *Solr6 (Jetty)

 

In this series of blogs, I won’t talk about the Alfresco PDF Renderer, ImageMagick & Java because these are just simple binaries/executables that need to be available from the Repository side. For LibreOffice, it’s usually Alfresco that is managing it directly (multi-processes, restart in case of crash, and so on). It wouldn’t really make sense to talk about these in blogs related to Clustering. I will also disregard the Database and File System ones since they are usually out of my scope. The Database is usually installed & managed by my colleagues who are DBAs, and they are much better at that than me. That leaves us with all components with an asterisk (*). I will update this list with links to the different blogs.

Before jumping into the first component, which will be the subject of the next blog, I wanted to go through some possible architectures for Alfresco. There are a lot of schemas available on the internet, but it’s often the same architecture that is presented, so I thought I would take some time to represent, in my own way, what Alfresco’s architecture could look like.

In the below schemas, I represented the main components: Front-end, Share, Repository, Solr, Database & File System (Data) as little boxes. As mentioned previously, I won’t talk about the Database & File System so I just represented them once to see the communications with these but what is behind their boxes can be anything (with HA/Clustering or not). The arrows represent the way communications are initiated: an arrow in a single direction “A->B” means that B is never initiating a communication with A. Boxes that are glued together represent all components installed on the same host (a physical server, a VM, a container or whatever).

 

[Figure: Alfresco Architecture 1]

N°1: This is the simplest architecture for Alfresco. As you can see, it’s not an HA/Clustering architecture but I decided to start small. I added a Front-end (even if it’s not mandatory) because it’s a best practice and I would not install Alfresco without it. There is nothing specific to say about this architecture, it’s just simple.

 

[Figure: Alfresco Architecture 2]

N°2: The first thing to do if you have the simplest architecture in place (N°1) and you start seeing some resource contention is to split the components and, more specifically, to install Solr separately. This should really be the minimal architecture to use, whenever possible.

 

[Figure: Alfresco Architecture 3]

N°3: This is the first HA/Clustering architecture. It starts small, as you can see, with just two nodes for each Front-end/Share/Repository stack and a Load Balancer to dispatch the load on each side for an Active/Active solution. The dotted grey lines represent the Clustering communications. In this architecture, there is therefore one Clustering for Share and another one for the Repository layer. The Front-end doesn’t need Clustering since it just forwards the communications, while the session itself is on the Tomcat (Share/Repository) side. There is only one Solr node and therefore both Repository boxes will communicate with it (through the Front-end or not). Between the Repository and Solr, there is one bidirectional arrow and one unidirectional arrow. That’s because both Repository boxes will initiate searches but Solr will do its tracking (to index new content) against only one Repository: this isn’t optimal.

 

[Figure: Alfresco Architecture 4]

N°4: To solve this small issue with Solr tracking, we can add a second Load Balancer in between so that the Solr tracking can target any Repository node. The first bottleneck you will encounter in Alfresco is usually the Repository because a lot of things are happening in the background at that layer. Therefore, this architecture is usually the simplest HA/Clustering solution that you will want to set up.

 

[Figure: Alfresco Architecture 5]

N°5: If you are facing some performance issues with Solr or if you want all components to be in HA, then you will have to duplicate Solr as well. Between the two Solr nodes, I put a Clustering link, which applies in case you are using Solr Sharding. If you are using the default cores (alfresco and archive), then there is no communication between distinct Solr nodes. If you are using Solr Sharding and you want an HA architecture, then you will have the same shards on both Solr nodes and, in this case, there will be communications between the Solr nodes. It’s not really Clustering so to speak, it’s just how Solr Sharding works, but I still used the same representation.

 

[Figure: Alfresco Architecture 6]

N°6: As mentioned previously (for N°4), the Repository is usually the bottleneck. To reduce the load on this layer, it is possible to do several things. The first possibility is to install another Repository and dedicate it to the Solr Tracking. As you can see above, the communications aren’t bidirectional anymore but only unidirectional. Searches will come from the two clustered Repository nodes and Solr Tracking will use the separate/dedicated Repository. This third Repository can then be set to read-only, its jobs and services can be disabled, and Clustering can be disabled as well (it uses the same DB but it’s not part of the Clustering communications because it doesn’t have to be), and so on. I put this third Repository as a standalone box but obviously you can install it alongside one of the two Solr nodes.

 

[Figure: Alfresco Architecture 7]

N°7: The next step can be to add another read-only Repository and put these two nodes side by side with the Solr nodes. This way, the Solr Tracking only uses localhost communications, which are a little bit easier to secure.

 

[Figure: Alfresco Architecture 8]

N°8: The previous architectures (N°6 & N°7) introduced a new single point of failure, so to fix this there is only one way: add a new Load Balancer between Solr and the Repository for the tracking. Behind the Load Balancer, there are two solutions: keep the fourth Repository, which is also read-only, or use a fallback to the Repository node1/node2 in case the read-only Repository (node3) isn’t available. For that purpose, the Load Balancer should be, respectively, Active/Active or Active/Passive. As you can see, I chose to represent the first one.

 

These were a few possible architectures. You can obviously add more nodes if you want to, to handle more load. There are many other solutions so have fun designing the best one, according to your requirements.

 

The post Alfresco Clustering – Basis & Architectures first appeared on Blog dbi services.

dbvisit 9: Automatic Failover

Sat, 2019-07-27 06:43

dbvisit 9 was released a few months ago, and one new feature I tested is Automatic Failover. In this blog I assume that dbvisit 9 is already installed and that the standby database is already created. I will describe neither the dbvisit installation nor the standby creation, as they are the same as in previous versions.
For more info about dbvisit installation and/or dbvisit standby creation, please see these steps in my previous blog or the dbvisit documentation.
The new Automatic Failover feature requires installing an observer whose main functions are:
- Provide remote monitoring of existing DDCs, and inform the DBA of problems in close to real-time
- Automatically perform a Failover of the DDC if previously-specified conditions are met.

We will describe the observer installation and configuration later.

Below is the configuration we are using:
dbvisit1: primary server with Oracle 19c
dbvisit2: standby server with Oracle 19c
dbvisitconsole : Host of the dbvisit console (dbvserver) and for the observer

As specified earlier, we need to install an observer. It is very easy to do this: just launch the install-dbvisit executable and follow the instructions.

[oracle@dbvisitconsole installer]$ pwd
/home/oracle/dbvisit/installer
[oracle@dbvisitconsole installer]$ ./install-dbvisit

-----------------------------------------------------------
    Welcome to the Dbvisit software installer.
-----------------------------------------------------------

    It is recommended to make a backup of our current Dbvisit software
    location (Dbvisit Base location) for rollback purposes.

    Installer Directory /home/oracle/dbvisit

>>> Please specify the Dbvisit installation directory (Dbvisit Base).

    The various Dbvisit products and components - such as Dbvisit Standby,
    Dbvisit Dbvnet will be installed in the appropriate subdirectories of
    this path.

    Enter a custom value or press ENTER to accept default [/usr/dbvisit]:
     > /u01/app/dbvisit
    DBVISIT_BASE = /u01/app/dbvisit

    -----------------------------------------------------------
    Component      Installer Version   Installed Version
    -----------------------------------------------------------
    standby        9.0.02_0_gbd40c486                                not installed                 
    dbvnet         9.0.02_0_gbd40c486                                not installed                 
    dbvagent       9.0.02_0_gbd40c486                                not installed                 
    dbvserver      9.0.02_0_gbd40c486                                9.0.02_0_gbd40c486            
    observer       1.02                                              not installed                 

    -----------------------------------------------------------

    What action would you like to perform?
       1 - Install component(s)
       2 - Uninstall component(s)
       3 - Exit

    Your choice: 1

    Choose component(s):
       1 - Core Components (Dbvisit Standby Cli, Dbvnet, Dbvagent)
       2 - Dbvisit Standby Core (Command Line Interface)
       3 - Dbvnet (Dbvisit Network Communication)
       4 - Dbvagent (Dbvisit Agent)
       5 - Dbvserver (Dbvisit Central Console) - Not available on Solaris/AIX
       6 - Dbvisit Observer (Automatic Failover Option) - Not available on Solaris/AIX
    Press ENTER to exit Installer

    Your choice: 6

-----------------------------------------------------------
    Summary of the Dbvisit OBSERVER configuration
-----------------------------------------------------------
    DBVISIT_BASE /u01/app/dbvisit

    Press ENTER to continue

-----------------------------------------------------------
    About to install Dbvisit OBSERVER
-----------------------------------------------------------
    Component observer installed.

    -----------------------------------------------------------
    Component      Installer Version   Installed Version
    -----------------------------------------------------------
    standby        9.0.02_0_gbd40c486                                not installed                 
    dbvnet         9.0.02_0_gbd40c486                                not installed                 
    dbvagent       9.0.02_0_gbd40c486                                not installed                 
    dbvserver      9.0.02_0_gbd40c486                                9.0.02_0_gbd40c486            
    observer       1.02                                              1.02                          

    -----------------------------------------------------------

    What action would you like to perform?
       1 - Install component(s)
       2 - Uninstall component(s)
       3 - Exit

    Your choice: 3

>>> Installation completed
    Install log /tmp/dbvisit_install.log.201907231647.
[oracle@dbvisitconsole installer]$

Once the installation is done, we can start the observer:

[oracle@dbvisitconsole observer]$ ./observersvc start &
[1] 2866

[oracle@dbvisitconsole observer]$ ps -ef | grep obser
oracle    2866  2275  0 14:25 pts/0    00:00:01 ./observersvc start
oracle    2921  2275  0 14:29 pts/0    00:00:00 grep --color=auto obser
[oracle@dbvisitconsole observer]$

After starting the observer, we have to add the observer server. This is done from the MANAGE CONFIGURATION tab in the MENU.

From the Configuration tab, choose NEW on the left to add a dbvisit observer.

Then fill in the information and save. Note that the default passphrase for the observer is admin900.

To have the observer monitor our configuration, let’s click on Monitor.

Then specify the poll interval and the number of retries before a failover happens. In our case:

The observer will check the configuration every 60 seconds and will retry 5 times if there is any error.
If the problem is not fixed after 5 minutes (5×60 seconds), an automatic failover will happen.
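
As a quick sanity check of the arithmetic above, here is a minimal shell sketch. The function name failover_delay is hypothetical and not part of dbvisit; it simply multiplies the poll interval by the number of retries to approximate how long a problem has to last before a failover is triggered.

```shell
# Hypothetical helper (not part of dbvisit): approximate time, in seconds,
# that a problem must persist before the observer triggers a failover.
failover_delay() {
  local poll_interval_s="$1" retries="$2"
  echo $(( poll_interval_s * retries ))
}

failover_delay 60 5   # prints 300
```

With a shorter poll interval or fewer retries the failover happens sooner, at the price of reacting to short-lived glitches.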

The observer logfile is located on the observer server

[oracle@dbvisitconsole log]$ pwd
/u01/app/dbvisit/observer/log
[oracle@dbvisitconsole log]$ ls -l
total 8
-rw-r--r--. 1 oracle oinstall 1109 Jul 25 15:24 observer.log
-rw-r--r--. 1 oracle oinstall   97 Jul 25 15:24 orcl_1_observer.log
[oracle@dbvisitconsole log]$

[oracle@dbvisitconsole log]$ tail -f orcl_1_observer.log
2019/07/25 13:24:46 DDC: DDC#1(orcl): Started watchdog: Watchdog successfully started monitoring

Now let’s break the primary database; a failover should normally happen after about 5 minutes.

[oracle@dbvisit1 log]$ ps -ef | grep pmon
oracle    1887     1  0 14:03 ?        00:00:00 ora_pmon_orcl
oracle   18199  1733  0 15:31 pts/0    00:00:00 grep --color=auto pmon
[oracle@dbvisit1 log]$ kill -9 1887
[oracle@dbvisit1 log]$ ps -ef | grep pmon
oracle   18304  1733  0 15:32 pts/0    00:00:00 grep --color=auto pmon
[oracle@dbvisit1 log]$

In the observer logfile we can see that the standby was promoted after 5 retries.

[oracle@dbvisitconsole log]$ tail -f orcl_1_observer.log
2019/07/25 13:24:46 DDC: DDC#1(orcl): Started watchdog: Watchdog successfully started monitoring
2019/07/25 13:33:51 DDC: DDC#1(orcl): rules failing (1/5): primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:34:51 DDC: DDC#1(orcl): rules failing (2/5): primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:35:51 DDC: DDC#1(orcl): rules failing (3/5): primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:36:51 DDC: DDC#1(orcl): rules failing (4/5): primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:37:51 DDC: DDC#1(orcl): rules failing (5/5): primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:37:51 DDC: DDC#1(orcl): configuration failed after 5 retries: primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:37:51 DDC: DDC#1(orcl): watchdog shutting down: activation imminent
2019/07/25 13:37:51 DDC: DDC#1(orcl): ACTIVATION started: conditions for activation satisfied
2019/07/25 13:38:41 DDC: DDC#1(orcl): ACTIVATION successful: ACTIVATION OK: standby activated, activation took: 50.043794192s
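
Instead of reading the log by eye, the same information can be extracted with grep. The sketch below is only an illustration: the function name summarize_observer_log is hypothetical, and the two strings it searches for ("rules failing" and "ACTIVATION successful") are simply taken from the log lines shown above.

```shell
# Hypothetical helper: summarize an observer log file.
# Prints how many failed rule checks were logged and whether
# an activation of the standby was reported as successful.
summarize_observer_log() {
  local logfile="$1"
  local failures activated="no"
  failures=$(grep -c 'rules failing' "$logfile")
  if grep -q 'ACTIVATION successful' "$logfile"; then
    activated="yes"
  fi
  echo "failures=${failures} activated=${activated}"
}

# Example call (path taken from the listing above):
# summarize_observer_log /u01/app/dbvisit/observer/log/orcl_1_observer.log
```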

And we can verify that the standby is now open in read write mode

[oracle@dbvisit2 trace]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Thu Jul 25 15:49:38 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Standard Edition 2 Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> select db_unique_name,open_mode from v$database;

DB_UNIQUE_NAME                 OPEN_MODE
------------------------------ --------------------
orcl                           READ WRITE

SQL>

Note that we can use a user-defined script with the observer. For more information, please see the dbvisit documentation.

The post dbvisit 9: Automatic Failover first appeared on Blog dbi services.

Alfresco – ActiveMQ basic setup

Fri, 2019-07-26 08:49

Apache ActiveMQ

ActiveMQ is an open source message broker from the Apache Software Foundation that implements the Java Message Service (JMS) API and supports a lot of protocols. In Alfresco 5, ActiveMQ was introduced as a new, optional component in the stack. At the beginning, in the early Alfresco 5.0, it was only used for “side” features like Alfresco Analytics or Alfresco Media Management. In Alfresco 6.0, ActiveMQ was still used for Alfresco Media Management but also for the Alfresco Sync Service. It’s only starting with Alfresco 6.1, released last February, that ActiveMQ became a required component, used for the same things but now also for transformations.

The Alfresco documentation doesn’t really describe how to install or configure ActiveMQ; it just explains how to connect Alfresco to it. Therefore, I thought I would write a small blog about how to do a basic installation of ActiveMQ for use with Alfresco.

Alfresco 6.1 supports ActiveMQ v5.15.6 so that’s the one I will be using for this blog as example.

First let’s start with defining some environment variables that will be used to know where to put ActiveMQ binaries and data:

[alfresco@mq_n1 ~]$ echo "export ACTIVEMQ_HOME=/opt/activemq" >> ~/.profile
[alfresco@mq_n1 ~]$ echo "export ACTIVEMQ_DATA=\$ACTIVEMQ_HOME/data" >> ~/.profile
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ grep "ACTIVEMQ" ~/.profile
export ACTIVEMQ_HOME=/opt/activemq
export ACTIVEMQ_DATA=$ACTIVEMQ_HOME/data
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ source ~/.profile
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ echo $ACTIVEMQ_DATA
/opt/activemq/data
[alfresco@mq_n1 ~]$

 

I usually use symlinks for all the components so that I can keep a generic path in case of upgrades, and so on. So, let’s download the software and put everything where it should be:

[alfresco@mq_n1 ~]$ activemq_version="5.15.6"
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ wget http://archive.apache.org/dist/activemq/${activemq_version}/apache-activemq-${activemq_version}-bin.tar.gz
--2019-07-25 16:55:23--  http://archive.apache.org/dist/activemq/5.15.6/apache-activemq-5.15.6-bin.tar.gz
Resolving archive.apache.org... 163.172.17.199
Connecting to archive.apache.org|163.172.17.199|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 58556801 (56M) [application/x-gzip]
Saving to: ‘apache-activemq-5.15.6-bin.tar.gz’

apache-activemq-5.15.6-bin.tar.gz     100%[=======================================================================>]  55.84M  1.62MB/s    in 35s

2019-07-25 16:55:58 (1.60 MB/s) - ‘apache-activemq-5.15.6-bin.tar.gz’ saved [58556801/58556801]

[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ tar -xzf apache-activemq-${activemq_version}-bin.tar.gz
[alfresco@mq_n1 ~]$ mkdir -p $ACTIVEMQ_HOME-${activemq_version}
[alfresco@mq_n1 ~]$ ln -s $ACTIVEMQ_HOME-${activemq_version} $ACTIVEMQ_HOME
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ ls -l $ACTIVEMQ_HOME/.. | grep -i activemq
lrwxr-xr-x   1 alfresco  alfresco        31 Jul 25 17:04 activemq -> /opt/activemq-5.15.6
drwxr-xr-x   2 alfresco  alfresco        64 Jul 25 17:03 activemq-5.15.6
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ rm -rf ./apache-activemq-${activemq_version}/data
[alfresco@mq_n1 ~]$ mkdir -p $ACTIVEMQ_DATA
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ mv apache-activemq-${activemq_version}/* $ACTIVEMQ_HOME/
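
The benefit of the symlink shows up at upgrade time: only the link has to be re-pointed, while the generic path stays the same. Below is a minimal sketch of that idea. The function name switch_version is hypothetical, and it assumes the same naming convention as above (generic path plus "-version").

```shell
# Hypothetical helper: point a generic path at a versioned directory,
# following the "<generic path>-<version>" convention used above.
switch_version() {
  local generic_path="$1" version="$2"
  local versioned_path="${generic_path}-${version}"
  mkdir -p "$versioned_path"
  # -f replaces an existing link, -n avoids descending into the old target
  ln -sfn "$versioned_path" "$generic_path"
}

# Example call:
# switch_version /opt/activemq 5.15.6
```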

 

Once that is done and before starting ActiveMQ for the first time, there are still some configurations to be done. It is technically possible to add a specific authentication for communications between Alfresco and ActiveMQ or to set up the communications in SSL, for example. It depends on the usage you will have for ActiveMQ, but as a minimal configuration for use with Alfresco, I believe that the default users (“guest” to access the broker & “user” to access the web console) should at least be removed and the admin password changed:

[alfresco@mq_n1 ~]$ activemq_admin_pwd="Act1v3MQ_pwd"
[alfresco@mq_n1 ~]$ activemq_broker_name="`hostname -s`"
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Remove user "user" from the web console
[alfresco@mq_n1 ~]$ sed -i "/^user:[[:space:]]*.*/d" $ACTIVEMQ_HOME/conf/jetty-realm.properties
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Remove user "guest" from the broker
[alfresco@mq_n1 ~]$ sed -i "/^guest.*/d" $ACTIVEMQ_HOME/conf/credentials.properties
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Change admin password
[alfresco@mq_n1 ~]$ sed -i "s/^admin=.*/admin=${activemq_admin_pwd}\n/" $ACTIVEMQ_HOME/conf/users.properties
[alfresco@mq_n1 ~]$ sed -i "s/^admin.*/admin: ${activemq_admin_pwd}, admin/" $ACTIVEMQ_HOME/conf/jetty-realm.properties
[alfresco@mq_n1 ~]$ sed -i "s/^activemq.username=.*/activemq.username=admin/" $ACTIVEMQ_HOME/conf/credentials.properties
[alfresco@mq_n1 ~]$ sed -i "s/^activemq.password=.*/activemq.password=${activemq_admin_pwd}/" $ACTIVEMQ_HOME/conf/credentials.properties
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ grep -E "brokerName|storeUsage |tempUsage " $ACTIVEMQ_HOME/conf/activemq.xml
    <broker xmlns="http://activemq.apache.org/schema/core" brokerName="localhost" dataDirectory="${activemq.data}">
                <storeUsage limit="100 gb"/>
                <tempUsage limit="50 gb"/>
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Set broker name & allowed usage
[alfresco@mq_n1 ~]$ sed -i "s/brokerName=\"[^"]*\"/brokerName=\"${activemq_broker_name}\"/" $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$ sed -i 's,storeUsage limit="[^"]*",storeUsage limit="10 gb",' $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$ sed -i 's,tempUsage limit="[^"]*",tempUsage limit="5 gb",' $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ grep -E "brokerName|storeUsage |tempUsage " $ACTIVEMQ_HOME/conf/activemq.xml
    <broker xmlns="http://activemq.apache.org/schema/core" brokerName="mq_n1" dataDirectory="${activemq.data}">
                    <storeUsage limit="10 gb"/>
                    <tempUsage limit="5 gb"/>
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ chmod -R o-rwx $ACTIVEMQ_HOME
[alfresco@mq_n1 ~]$ chmod -R o-rwx $ACTIVEMQ_DATA
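
One caveat with the sed commands above: the password is interpolated directly into the replacement text, so a password containing "/", "&" or a backslash would break the command or produce a wrong value. A minimal sketch of a workaround follows; the function name escape_sed_replacement is hypothetical and it assumes "/" is used as the sed delimiter, as in the commands above.

```shell
# Hypothetical helper: escape the characters that are special in a sed
# replacement when "/" is the delimiter (backslash, slash, ampersand).
escape_sed_replacement() {
  printf '%s' "$1" | sed -e 's|[\\/&]|\\&|g'
}

# Example call, assuming the same variable as above:
# safe_pwd="$(escape_sed_replacement "${activemq_admin_pwd}")"
# sed -i "s/^activemq.password=.*/activemq.password=${safe_pwd}/" $ACTIVEMQ_HOME/conf/credentials.properties
```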

 

Above, I set a specific name for the broker, mainly to differentiate brokers if you expect to have several of them at some point. I also changed the default storeUsage and tempUsage, mainly to show how it’s done, because these two parameters define the limits of what ActiveMQ is allowed to use on the file system. I believe the defaults are way too high for ActiveMQ’s usage in Alfresco, so I always reduce them or use a percentage as value (percentLimit).

With the default configuration, ActiveMQ uses “${activemq.data}” for the data directory, which takes its value from the “$ACTIVEMQ_DATA” environment variable, if present (otherwise it is set to $ACTIVEMQ_HOME/data). That’s the reason why I set this environment variable, so it is possible to define a different data folder without having to change the default configuration. This data folder will mainly contain the logs of ActiveMQ, the PID file and the KahaDB for the persistence adapter.

Finally, creating a service for ActiveMQ and starting it is pretty easy as well:
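
The fallback described above can be mimicked in plain shell. This is only an illustration of the behaviour as described here, not the code of the ActiveMQ start script; the function name resolve_activemq_data is hypothetical.

```shell
# Hypothetical helper: return the data directory the way it is described
# above, i.e. $ACTIVEMQ_DATA if set, otherwise $ACTIVEMQ_HOME/data.
resolve_activemq_data() {
  echo "${ACTIVEMQ_DATA:-${ACTIVEMQ_HOME}/data}"
}
```

Setting ACTIVEMQ_DATA in the profile, as done earlier, therefore makes the choice of data folder explicit without touching activemq.xml.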

[alfresco@mq_n1 ~]$ cat > activemq.service << EOF
[Unit]
Description=ActiveMQ service

[Service]
Type=forking
ExecStart=###ACTIVEMQ_HOME###/bin/activemq start
ExecStop=###ACTIVEMQ_HOME###/bin/activemq stop
Restart=always
User=alfresco
WorkingDirectory=###ACTIVEMQ_DATA###
LimitNOFILE=8192:65536

[Install]
WantedBy=multi-user.target
EOF
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sed -i "s,###ACTIVEMQ_HOME###,${ACTIVEMQ_HOME}," activemq.service
[alfresco@mq_n1 ~]$ sed -i "s,###ACTIVEMQ_DATA###,${ACTIVEMQ_DATA}," activemq.service
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo cp activemq.service /etc/systemd/system/
[alfresco@mq_n1 ~]$ rm activemq.service
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo systemctl enable activemq.service
[alfresco@mq_n1 ~]$ sudo systemctl daemon-reload
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo systemctl start activemq.service

 

Once ActiveMQ is set up as you want, registering it in Alfresco is very easy:

[alfresco@alf_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### ActiveMQ
messaging.broker.url=failover:(tcp://mq_n1.domain:61616?daemon=false&dynamicManagement=false&trace=false)?timeout=3000&randomize=false
#messaging.username=
#messaging.password=
...
[alfresco@alf_n1 ~]$
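
The failover transport accepts a comma-separated list of broker URIs, so the same URL format extends to several brokers. The sketch below builds such a URL from a list of hosts. The function name build_broker_url is hypothetical, and the port and options are simply copied from the example above; a second host such as mq_n2.domain would be an assumption, since only one broker is installed in this blog.

```shell
# Hypothetical helper: build a failover broker URL from one or more hosts.
# The port and the options are copied from the example above.
build_broker_url() {
  local opts="daemon=false&dynamicManagement=false&trace=false"
  local uris="" host
  for host in "$@"; do
    # Append a comma only when there is already at least one URI
    uris="${uris:+${uris},}tcp://${host}:61616?${opts}"
  done
  echo "failover:(${uris})?timeout=3000&randomize=false"
}

# Example call:
# build_broker_url mq_n1.domain
```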

 

As mentioned at the beginning of this blog, ActiveMQ supports a lot of protocols so you can use pretty much what you want: TCP, NIO, SSL, NIO SSL, Peer (peer-to-peer), UDP, Multicast, HTTP, HTTPS, and so on. You can find all the details in the ActiveMQ documentation.

To add authentication between Alfresco and ActiveMQ, you will need to enable the properties in the alfresco-global.properties (the two that I commented above) and define the appropriate authentication in the ActiveMQ broker configuration. There is an example in the Alfresco documentation.

 

The post Alfresco – ActiveMQ basic setup first appeared on Blog dbi services.

Deploying SQL Server 2019 container on RHEL 8 with podman

Wed, 2019-07-24 11:12

Having a fresh install of RHEL 8 in my lab environment, I was curious to take a look at the new containerization stuff from Red Hat in the context of SQL Server 2019. Chances are good that the future version of SQL Server will be available and supported on the latest version of Red Hat, but for now this blog post is purely experimental. This time I wanted to share with you some thoughts about the new Podman command.

First of all, we should be aware that since RHEL 8, Red Hat decided to replace Docker with CRI-O/Podman in order to provide a “daemonless” container world, especially for Kubernetes. In 2016, the Kubernetes project introduced the Container Runtime Interface (CRI). Basically, with CRI, Kubernetes can be container runtime-agnostic. CRI-O is an open source project initiated by Red Hat the same year that gives the ability to run containers directly from Kubernetes without any unnecessary code or tooling, as long as the container remains OCI-compliant. Because Docker is no longer shipped (and officially not supported) by Red Hat since RHEL 8, we need a client tool for working with containers, and this is where Podman steps in. To cut a long story short, Podman implements almost all the Docker CLI commands and more.

So, let’s have an overview of Podman commands through the installation of a SQL Server 2019 based container. It is worth noting that Podman is not intended to be used in the context of a “standalone” container environment and should be used with a container orchestrator like K8s or an orchestration platform like OpenShift. That said, let’s first create a host directory to persist the SQL Server database files.

$ sudo mkdir -p  /var/mssql/data
$ sudo chmod 755 -R /var/mssql/data

 

Then let’s download the SQL Server 2019 RHEL image. We will use the following Podman command:

$ sudo podman pull mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1
Trying to pull mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1...Getting image source signatures
Copying blob 079e961eee89: 70.54 MiB / 70.54 MiB [========================] 1m3s
Copying blob 1b493d38a6d3: 1.20 KiB / 1.20 KiB [==========================] 1m3s
Copying blob 89e62e5b4261: 333.24 MiB / 333.24 MiB [======================] 1m3s
Copying blob d39017c722a8: 174.82 MiB / 174.82 MiB [======================] 1m3s
Copying config dbba412361d7: 4.98 KiB / 4.98 KiB [==========================] 0s
Writing manifest to image destination
Storing signatures
dbba412361d7ca4fa426387e1d6fc3ec85e37d630bfe70e6599b5116d392394d

 

Note that if you’re already comfortable with the Docker commands, the shift to Podman will be easy thanks to the similarity between the two tools. To get information about the freshly pulled image, we will use the following Podman commands:

$ sudo podman images
REPOSITORY                            TAG           IMAGE ID       CREATED       SIZE
mcr.microsoft.com/mssql/rhel/server   2019-CTP3.1   dbba412361d7   3 weeks ago   1.79 GB
$ sudo podman inspect dbba
…
"GraphDriver": {
            "Name": "overlay",
            "Data": {
                "LowerDir": "/var/lib/containers/storage/overlay/b2769e971a1bdb62f1c0fd9dcc0e9fe727dca83f52812abd34173b49ae55e37d/diff:/var/lib/containers/storage/overlay/4b0cbf0d9d0ff230916734a790f47ab2adba69db44a79c8eac4c814ff4183c6d/diff:/var/lib/containers/storage/overlay/9197342671da8b555f200e47df101da5b7e38f6d9573b10bd3295ca9e5c0ae28/diff",
                "MergedDir": "/var/lib/containers/storage/overlay/b372c0d6ff718d2d182af4639870dc6e4247f684d81a8b2dc2649f8517b9fc53/merged",
                "UpperDir": "/var/lib/containers/storage/overlay/b372c0d6ff718d2d182af4639870dc6e4247f684d81a8b2dc2649f8517b9fc53/diff",
                "WorkDir": "/var/lib/containers/storage/overlay/b372c0d6ff718d2d182af4639870dc6e4247f684d81a8b2dc2649f8517b9fc53/work"
            }
        },
…

 

As shown above, Podman uses the CRI-O back-end store directory with the /var/lib/containers path, instead of using the Docker default storage location (/var/lib/docker).

Let’s go ahead and take a look at the Podman info command:

$ podman info
…
OCIRuntime:
    package: runc-1.0.0-54.rc5.dev.git2abd837.module+el8+2769+577ad176.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.0'
…
store:
  ConfigFile: /home/clustadmin/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: overlay

 

The same kind of information is provided by the Docker info command, including the runtime and the graph driver name, which is overlay in my case. Generally speaking, creating a container and getting information about it with Podman is pretty similar to what we do with the usual Docker commands. Here is, for instance, the command to spin up a SQL Server container based on the RHEL image:

$ sudo podman run -d -e 'ACCEPT_EULA=Y' -e \
> 'MSSQL_SA_PASSWORD=Password1'  \
> --name 'sqltest' \
> -p 1460:1433 \
> -v /var/mssql/data:/var/opt/mssql/data:Z \
> mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1
4f5128d36e44b1f55d23e38cbf8819041f84592008d0ebb2b24ff59065314aa4
$ sudo podman ps
CONTAINER ID  IMAGE                                            COMMAND               CREATED        STATUS            PORTS                   NAMES
4f5128d36e44  mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1  /opt/mssql/bin/sq...  4 seconds ago  Up 3 seconds ago  0.0.0.0:1460->1433/tcp  sqltest

 

Here comes the interesting part. Looking at the pstree output, we may notice that there is no dependency on any (Docker) daemon with the CRI-O implementation. Usually, with the Docker implementation, we find the containerd daemon and the related shim for the process within the tree.

$ pstree
systemd─┬─NetworkManager───2*[{NetworkManager}]
        ├─…
        ├─conmon─┬─sqlservr─┬─sqlservr───138*[{sqlservr}]
        │        │          └─{sqlservr}

 

By using the runc command below, we may notice that the MSSQL container (identified by its ID here) is actually running through CRI-O and the runc runtime.

$ sudo runc list -q
4f5128d36e44b1f55d23e38cbf8819041f84592008d0ebb2b24ff59065314aa4

 

Let’s have a look at the existing namespaces. PID 9449 corresponds to the SQL Server process running in isolation through Linux namespaces.

$ sudo lsns 
…
4026532116 net         2  9449 root   /opt/mssql/bin/sqlservr
4026532187 mnt         2  9449 root   /opt/mssql/bin/sqlservr
4026532188 uts         2  9449 root   /opt/mssql/bin/sqlservr
4026532189 ipc         2  9449 root   /opt/mssql/bin/sqlservr
4026532190 pid         2  9449 root   /opt/mssql/bin/sqlservr

$ ps aux | grep sqlservr
root       9449  0.1  0.6 152072 25336 ?        Ssl  05:08   0:00 /opt/mssql/bin/sqlservr
root       9465  5.9 18.9 9012096 724648 ?      Sl   05:08   0:20 /opt/mssql/bin/sqlservr
clustad+   9712  0.0  0.0  12112  1064 pts/0    S+   05:14   0:00 grep --color=auto sqlservr
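
The PID owning the namespaces can also be pulled out of the lsns output with a small filter. This is only a sketch: the function name pids_owning_namespaces is hypothetical, and it assumes the column layout shown above (PID in the 4th column, command in the 6th).

```shell
# Hypothetical helper: read lsns output on stdin and print the distinct
# PIDs (4th column) whose command (6th column) ends with the given name.
pids_owning_namespaces() {
  awk -v name="$1" '$6 ~ (name "$") { print $4 }' | sort -u
}

# Example call:
# sudo lsns | pids_owning_namespaces sqlservr
```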

 

We can double check that the process belongs to the SQL Server container by using the nsenter command:

sudo nsenter -t 17182 --mount --uts --ipc --net --pid sh
sh-4.2# ps aux
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          1  0.0  0.7 152076 28044 ?        Ssl  Jul23   0:00 /opt/mssql/bin/sqlservr
root          9  2.2 19.7 9034224 754820 ?      Sl   Jul23   0:28 /opt/mssql/bin/sqlservr
root        319  0.0  0.0  13908  3400 ?        S    00:01   0:00 sh
root        326  0.0  0.1  53832  3900 ?        R+   00:02   0:00 ps aux

 

Well, we used different Podman commands to spin up a container that meets the OCI specification, like Docker does. For the sake of curiosity, let’s build a custom image from a Dockerfile. In fact, this is a custom image we developed for customers to meet our best practice requirements.

$ ls -l
total 40
drwxrwxr-x. 2 clustadmin clustadmin   70 Jul 24 02:06 BestPractices
drwxrwxr-x. 2 clustadmin clustadmin   80 Jul 24 02:06 DMK
-rw-rw-r--. 1 clustadmin clustadmin  614 Jul 24 02:06 docker-compose.yml
-rw-rw-r--. 1 clustadmin clustadmin 2509 Jul 24 02:06 Dockerfile
-rw-rw-r--. 1 clustadmin clustadmin 3723 Jul 24 02:06 entrypoint.sh
-rw-rw-r--. 1 clustadmin clustadmin 1364 Jul 24 02:06 example.docker-swarm-compose.yml
-rw-rw-r--. 1 clustadmin clustadmin  504 Jul 24 02:06 healthcheck.sh
-rw-rw-r--. 1 clustadmin clustadmin   86 Jul 24 02:06 mssql.conf
-rw-rw-r--. 1 clustadmin clustadmin 4497 Jul 24 02:06 postconfig.sh
-rw-rw-r--. 1 clustadmin clustadmin 2528 Jul 24 02:06 Readme.md
drwxrwxr-x. 2 clustadmin clustadmin   92 Jul 24 02:06 scripts

 

To build an image from a Dockerfile, the corresponding Podman command is as follows:

$ sudo podman build -t dbi_mssql_linux:2019-CTP3.1 .
…
--> 5db120fba51f3adc7482ec7a9fed5cc4194f13e97b855d9439a1386096797c39
STEP 65: FROM 5db120fba51f3adc7482ec7a9fed5cc4194f13e97b855d9439a1386096797c39
STEP 66: EXPOSE ${MSSQL_TCP_PORT}
--> 8b5e8234af47adb26f80d64abe46715637bd48290b4a6d7711ddf55c393cd5a8
STEP 67: FROM 8b5e8234af47adb26f80d64abe46715637bd48290b4a6d7711ddf55c393cd5a8
STEP 68: ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
--> 11045806b8af7cf2f67e5a279692e6c9e25212105bcd104ed17b235cdaea97fe
STEP 69: FROM 11045806b8af7cf2f67e5a279692e6c9e25212105bcd104ed17b235cdaea97fe
STEP 70: CMD ["tail -f /dev/null"]
--> bcb8c26d503010eb3e5d72da4b8065aa76aff5d35fac4d7958324ac3d97d5489
STEP 71: FROM bcb8c26d503010eb3e5d72da4b8065aa76aff5d35fac4d7958324ac3d97d5489
STEP 72: HEALTHCHECK --interval=15s CMD [ "/usr/local/bin/healthcheck.sh" ]
--> e7eedf0576f73c95b19adf51c49459b00449da497cf7ae417e597dd39a9e4c8f
STEP 73: COMMIT dbi_mssql_linux:2019-CTP3.1

 

The image built is now available in the local repository:

$ sudo podman images
REPOSITORY                            TAG           IMAGE ID       CREATED         SIZE
localhost/dbi_mssql_linux             2019-CTP3.1   e7eedf0576f7   2 minutes ago   1.79 GB
mcr.microsoft.com/mssql/rhel/server   2019-CTP3.1   dbba412361d7   3 weeks ago     1.79 GB

 

The next step consists in spinning up a SQL Server container based on this new image. Note that I used a custom parameter DMK=Y to drive the creation of our DMK maintenance tool, which includes the deployment of a custom dbi_tools database and related objects that carry out the database maintenance.

$ sudo podman run -d -e 'ACCEPT_EULA=Y' \
> -e 'MSSQL_SA_PASSWORD=Password1' -e 'DMK=Y'  \
> --name 'sqltest2' \
> -p 1470:1433 \
> localhost/dbi_mssql_linux:2019-CTP3.1
d057e0ca41f08a948de4206e9aa07b53450c2830590f2429e50458681d230f6b

 

Let’s check whether the dbi_tools database has been created during the container runtime phase:

$ sudo podman exec -ti d057 /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P Password1 -Q"SELECT name from sys.databases"
name
--------------------------------------------------------------------------------------------------------------------------------
master
tempdb
model
msdb
dbi_tools

 

Finally, to make the transition to a future blog post, the Podman tool comes with extra commands (under development) that are not available with the Docker CLI. The following example generates a YAML deployment file and the corresponding service from an existing container. Please note, however, that containers with volumes are not supported yet.

The container definition is as follows:

$ sudo podman run -d -e 'ACCEPT_EULA=Y' -e \
'MSSQL_SA_PASSWORD=Password1'  \
--name 'sqltestwithnovolumes' \
-p 1480:1433 \
mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1
7e99581eaec4c91d7c13af4525bfb3805d5b56e675fdb53d0061c231294cd442

 

And we get the corresponding YAML file generated by the Podman command:

$ sudo podman generate kube -s 7e99
# Generation of Kubernetes YAML is still under development!
#
# Save the output of this file and use kubectl create -f to import
# it into Kubernetes.
#
# Created with podman-1.0.2-dev
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: 2019-07-24T03:52:18Z
  labels:
    app: sqltestwithnovolumes
  name: sqltestwithnovolumes
spec:
  containers:
  - command:
    - /opt/mssql/bin/sqlservr
    env:
    - name: PATH
      value: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    - name: TERM
      value: xterm
    - name: HOSTNAME
    - name: container
      value: oci
    - name: ACCEPT_EULA
      value: "Y"
    - name: MSSQL_SA_PASSWORD
      value: Password1
    image: mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1
    name: sqltestwithnovolumes
    ports:
    - containerPort: 1433
      hostPort: 1480
      protocol: TCP
    resources: {}
    securityContext:
      allowPrivilegeEscalation: true
      capabilities: {}
      privileged: false
      readOnlyRootFilesystem: false
    workingDir: /
status: {}
---
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: 2019-07-24T03:52:18Z
  labels:
    app: sqltestwithnovolumes
  name: sqltestwithnovolumes
spec:
  ports:
  - name: "1433"
    nodePort: 30309
    port: 1433
    protocol: TCP
    targetPort: 0
  selector:
    app: sqltestwithnovolumes
  type: NodePort
status:
  loadBalancer: {}

 

By default, the command created a service of type NodePort. This last command definitely needs further testing!

See you

The post Deploying SQL Server 2019 container on RHEL 8 with podman first appeared on Blog dbi services.

SQL Server 2019 availability group R/W connection redirection, routing mesh and load balancing

Tue, 2019-07-23 10:59

The SQL Server 2019 availability group feature will provide secondary-to-primary replica read/write connection redirection. I wrote about it in a previous blog post here. It consists in redirecting client application connections to the primary replica regardless of the target server specified in the connection string. That’s pretty interesting in some scenarios such as read scale-out or specific multi-subnet configurations where creating the traditional AG listener is not a viable option.

The new R/W connection redirection capability does the job, but the million-dollar question is: what happens if one of the replicas specified in my connection string suddenly becomes unavailable? According to the BOL, the connection will fail regardless of the role the replica on the target server plays, but we can mitigate the issue by introducing the failover partner parameter in the connection string. As a reminder, the Failover Partner keyword in the connection string works in a database mirroring setup and prevents prolonged application downtime. But from my point of view, we could go another way and benefit from all the power of this new availability group feature by introducing a load balancer on top of this topology, as we could do with Docker Swarm or K8s architectures. Indeed, if we take a closer look, this new mechanism provided by SQL Server 2019 is pretty similar to the routing mesh capabilities of container orchestrators, with the same advantages and weaknesses. I wrote a blog post about Docker Swarm architectures where we need to implement a proxy to load balance the traffic, to avoid getting stuck with the routing mesh capability when a node becomes unhealthy.

I just applied the same kind of configuration by using HAProxy (but you can obviously use your own load balancer) with my availability group topology, and the behavior was basically the same. Here is the intended behavior:

 

Here is the configuration of my HAProxy, including my 3 AG replicas (WIN20191, WIN20192, WIN20193) and a round-robin algorithm at the bottom:

…
backend rserve_backend
    mode tcp
    option tcplog
    option log-health-checks
    option redispatch
    log global
    balance roundrobin
    timeout connect 10s
    timeout server 1m
    server WIN20191 192.168.40.205:1433 check
    server WIN20192 192.168.40.206:1433 check
    server WIN20193 192.168.40.207:1433 check

 

Let’s try with connections made directly to my HAProxy, which listens on port 81 in my test scenario. Note that for this first test I connect to the master database to force each connection to stick to the local replica rather than going through the R/W redirection process. The goal is to check whether the round-robin algorithm comes into play …

$connectionString = "Server=192.168.40.14,81;uid=sa; pwd=xxxx;Integrated Security=False;Initial Catalog=master;pooling=false"

$connection = New-Object System.Data.SqlClient.SqlConnection
$connection.ConnectionString = $connectionString
$connection.Open()

$sqlCommandText="SELECT 'Current server : ' + @@SERVERNAME AS server_name"
$sqlCommand = New-Object system.Data.sqlclient.SqlCommand($sqlCommandText,$connection)
$sqlCommand.ExecuteScalar()

$connection.Close()
$connection.Dispose()

 

… and that’s the case, as shown below:

Test connexion initial server nb : 0 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 1 - 192.168.40.14,81 - Current server : WIN20192
Test connexion initial server nb : 2 - 192.168.40.14,81 - Current server : WIN20193
Test connexion initial server nb : 3 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 4 - 192.168.40.14,81 - Current server : WIN20192
Test connexion initial server nb : 5 - 192.168.40.14,81 - Current server : WIN20193

 

Let’s now try forcing the R/W redirection. This time I specified the correct target database, named dummy2, for my availability group AG2019.

$connectionString = "Server=192.168.40.14,81;uid=sa; pwd=xxxx;Integrated Security=False;Initial Catalog=dummy2;pooling=false"

$connection = New-Object System.Data.SqlClient.SqlConnection
$connection.ConnectionString = $connectionString
$connection.Open()

$sqlCommandText="SELECT 'Current server : ' + @@SERVERNAME AS server_name"
$sqlCommand = New-Object system.Data.sqlclient.SqlCommand($sqlCommandText,$connection)
$sqlCommand.ExecuteScalar()

$connection.Close()
$connection.Dispose()

Test connexion initial server nb : 0 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 1 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 2 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 3 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 4 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 5 - 192.168.40.14,81 - Current server : WIN20191

 

This time the R/W redirection is taking effect and each established connection is redirected to my primary replica – WIN20191 this time.

Finally, let’s simulate an outage of one of my replicas, say WIN20193, by turning it off, and let’s see what happens below:

Test connexion initial server nb : 32 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 33 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 34 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 35 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 36 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 37 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 38 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 39 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 40 - 192.168.40.14,81 - Current server : WIN20191

 

Well, from a connection perspective nothing has changed, and the HAProxy continues to load balance connections between the remaining healthy replicas. The R/W connection redirection mechanism still comes into play as well. A quick look at the HAProxy log indicates that the WIN20193 replica became unhealthy and that the HAProxy has evicted it from the game.

[WARNING] 203/063813 (1772) : Health check for server rserve_backend/WIN20193 failed, reason: Layer4 timeout, check duration: 2001ms, status: 2/3 UP.
[WARNING] 203/063818 (1772) : Health check for server rserve_backend/WIN20193 failed, reason: Layer4 timeout, check duration: 2002ms, status: 1/3 UP.
[WARNING] 203/063822 (1772) : Health check for server rserve_backend/WIN20193 failed, reason: Layer4 timeout, check duration: 2001ms, status: 0/2 DOWN.
[WARNING] 203/063822 (1772) : Server rserve_backend/WIN20193 is DOWN. 2 active and 0 backup servers left. 2 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 203/063848 (1772) : Health check for server rserve_backend/WIN20193 failed, reason: Layer4 connection problem, info: "No route to host", check duration: 4ms, status: 0/2 DOWN.
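
To picture what the load balancer does once a replica is evicted, here is a small shell sketch of round-robin selection over the healthy servers only. It is an illustration of the idea under my own assumptions, not how HAProxy is implemented, and pick_server is a hypothetical helper name. It only models the choice of the initial server; the R/W redirection to the primary happens afterwards on the SQL Server side.

```shell
# Round-robin over a space-separated server list, skipping servers marked down.
# $1 = connection number, $2 = all servers, $3 = servers currently down
pick_server() {
    n="$1"
    healthy=""
    for s in $2; do
        case " $3 " in
            *" $s "*) ;;                      # evicted by the health check
            *) healthy="$healthy $s" ;;       # still eligible
        esac
    done
    set -- $healthy                           # positional parameters = healthy servers
    [ "$#" -gt 0 ] || return 1                # no healthy server left
    shift $(( $n % $# ))
    echo "$1"
}

# WIN20193 is down: connections alternate between the two remaining replicas
for i in 0 1 2 3 4 5; do
    echo "connection $i -> $(pick_server "$i" "WIN20191 WIN20192 WIN20193" "WIN20193")"
done
```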

 

The new R/W redirection capability provided by Microsoft will surely extend the possible scenarios with availability groups. With previous versions of SQL Server, using a load balancer was limited to R/O workloads, but SQL Server 2019 will probably change the game on this topic. Let’s see what happens in the future!

The post SQL Server 2019 availability group R/W connection redirection, routing mesh and load balancing first appeared on Blog dbi services.

ORA-01000 and agent13c

Tue, 2019-07-23 03:51

Recently I received error messages from OEM13c saying too many cursors were open in a database:

instance_throughput:ORA-01000: maximum open cursors exceeded

My database currently had this open_cursors value:

SQL> show parameter open_cursors

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
open_cursors                         integer     300

I decided to increase its value to 800:

SQL> alter system set open_cursors=800;

System altered.

But a few minutes later I received the same message again. I decided to take a closer look to discover what was happening.

SQL> SELECT  max(a.value) as highest_open_cur, p.value as max_open_cur FROM v$sesstat a, v$statname b, v$parameter p WHERE  a.statistic# = b.statistic#  and b.name = 'opened cursors current' and p.name= 'open_cursors' group by p.value;

HIGHEST_OPEN_CUR     MAX_OPEN_CUR
   300                 800

So I needed to find out which session was causing the error:

SQL> select a.value, s.username, s.sid, s.serial# from v$sesstat a, v$statname b, v$session s where a.statistic# = b.statistic#  and s.sid=a.sid and b.name = 'opened cursors current' and s.username is not null;

     VALUE USERNAME                              SID    SERIAL#
---------- ------------------------------ ---------- ----------
         9 SYS                                     6      36943
         1 SYS                                   108      31137
         1 SYS                                   312      15397
       300 SYS                                   417      31049
        11 SYS                                   519      18527
         7 SYS                                   619      48609
         1 SYS                                   721      51139
         0 PUBLIC                                922         37
        17 SYS                                  1024          1
        14 SYS                                  1027      25319
         1 SYS                                  1129      40925

A SYS connection is using 300 cursors :=(( let’s see what it is:

Great, the 13c agent is causing the problem :=((

I had already encountered this kind of problem at another client’s site. In fact, the 13c agent uses metrics to determine the High Availability Disk or Media Backup every 15 minutes, and these use a lot of cursors. The best way to avoid ORA-01000 errors is to disable those metrics:

After reloading the agent and reevaluating the alert, the incident disappeared successfully :=)

The post ORA-01000 and agent13c first appeared on Blog dbi services.

Documentum – D2+Pack Plugins not installed correctly

Sun, 2019-07-21 08:43

In a previous blog, I explained how D2 can be installed silently. In this blog, I will talk about an issue that might happen when doing so: the D2+Pack Plugins aren’t installed, even if you ask D2 to install them, and there is no message or error related to the problem. The first time I had this issue was several years ago, but I never blogged about it. I faced it again recently, so I thought I would this time.

So first, let’s prepare the D2 and D2+Pack packages for the silent installation. I will take the D2_template.xml file from my previous blog as a starting point for the silent parameter file:

[dmadmin@cs_01 ~]$ cd $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$ ls *.zip *.tar.gz
-rw-r-----. 1 dmadmin dmadmin 491128907 Jun 16 08:12 D2_4.7.0_P25.zip
-rw-r-----. 1 dmadmin dmadmin  61035679 Jun 16 08:12 D2_pluspack_4.7.0.P25.zip
-rw-r-----. 1 dmadmin dmadmin 122461951 Jun 16 08:12 emc-dfs-sdk-7.3.tar.gz
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_4.7.0_P25.zip -d $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25.zip -d $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Dar-Install.zip -d $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Dar-Install.zip -d $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Dar-Install.zip -d $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/
[dmadmin@cs_01 D2-Install]$ tar -xzvf $DOCUMENTUM/D2-Install/emc-dfs-sdk-7.3.tar.gz -C $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ #See the previous blog for the content of the "/tmp/dctm_install/D2_template.xml" file
[dmadmin@cs_01 D2-Install]$ export d2_install_file=$DOCUMENTUM/D2-Install/D2.xml
[dmadmin@cs_01 D2-Install]$ cp /tmp/dctm_install/D2_template.xml ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###WAR_REQUIRED###,true," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$ sed -i "s,###BPM_REQUIRED###,true," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$ sed -i "s,###DAR_REQUIRED###,true," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###DOCUMENTUM###,$DOCUMENTUM," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###PLUGIN_LIST###,$DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar;$DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar;$DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar;," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###JMS_HOME###,$DOCUMENTUM_SHARED/wildfly9.0.1," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###DFS_SDK_PACKAGE###,emc-dfs-sdk-7.3," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ read -s -p "  ----> Please enter the Install Owner's password: " dm_pw; echo; echo
  ----> Please enter the Install Owner's password: <TYPE HERE THE PASSWORD>
[dmadmin@cs_01 D2-Install]$ sed -i "s,###INSTALL_OWNER###,dmadmin," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$ sed -i "s,###INSTALL_OWNER_PASSWD###,${dm_pw}," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s/###DOCBASE_LIST###/Docbase1/" ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
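
One caveat with the sed commands above: they use "," as the delimiter, so a password containing ",", "&" or "\" would change the meaning of the replacement. The following is a minimal sketch of an escaping helper, assuming a single-line password; sed_escape is a hypothetical name of mine, and the sample line and password are made up for the illustration.

```shell
# Escape the characters that are special in a sed replacement
# when "," is used as the delimiter: backslash, ampersand and comma.
sed_escape() {
    printf '%s' "$1" | sed -e 's/[\\&,]/\\&/g'
}

# Hypothetical password containing a comma and an ampersand
dm_pw_escaped="$(sed_escape 'S3cr,t&pw')"

# Hypothetical sample line standing in for the silent parameter file
echo 'password=###INSTALL_OWNER_PASSWD###' \
    | sed "s,###INSTALL_OWNER_PASSWD###,${dm_pw_escaped},"
# prints: password=S3cr,t&pw
```

With such a helper, the substitution for ###INSTALL_OWNER_PASSWD### could use the escaped value instead of ${dm_pw} directly.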

 

Now that the silent file is ready and all source packages are available, we can start the D2 installation with the command below. Please note the usage of the tracing/debugging options as well as the “-Djava.io.tmpdir” Java option, which asks D2 to put all temporary files in a specific directory. With this, D2 is supposed to trace/debug everything and use my specific temporary folder:

[dmadmin@cs_01 D2-Install]$ java -DTRACE=true -DDEBUG=true -Djava.io.tmpdir=$DOCUMENTUM/D2-Install/tmp -jar $DOCUMENTUM/D2-Install/D2_4.7.0_P25/D2-Installer-4.7.0.jar ${d2_install_file}

 

The D2 Installer printed the following extract:

...
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Installing plugin: $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/plugin/D2-Widget-Install.jar
...
...
Current line: #################################
Current line: #           Plugins               #
Current line: #################################
Current line: #plugin_1=../C2/C2-Plugin.jar
Updating line with 'plugin_'.
Updating plugin 1 with plugin name: D2-Widget-Plugin.jar and config exclude value of: false
Updating plugin 2 with plugin name: D2-Specifications-Plugin.jar and config exclude value of: false
Current line: #plugin_2=../O2/O2-Plugin.jar
Current line: #plugin_3=../P2/P2-Plugin.jar
...

 

As you can see, there are no errors so if you aren’t paying attention, you might think that the D2+Pack is properly installed. It’s not. At the end of the extract I put above, you can see that the D2 Installer is updating the plugins list with some elements (D2-Widget-Plugin.jar & D2-Specifications-Plugin.jar). If there were no issue, the D2+Pack Plugins would have been added in this section as well, which isn’t the case.

You can check all temporary files and all log files: nowhere is it reported that there was an issue while installing the D2+Pack Plugins. In fact, there are 3 things missing:

  • The DARs of the D2+Pack Plugins weren’t installed
  • The libraries of the D2+Pack Plugins weren’t deployed into the JMS
  • The libraries of the D2+Pack Plugins weren’t packaged in the WAR files

There is a quick way to check whether the D2+Pack Plugins DARs have been installed: just look inside the docbase config folder. There should be one log file for the D2 Core DARs as well as one log file for each of the D2+Pack Plugins. So that’s what you should get:

[dmadmin@cs_01 D2-Install]$ cd $DOCUMENTUM/dba/config/Docbase1/
[dmadmin@cs_01 Docbase1]$ ls -ltr *.log
-rw-r-----. 1 dmadmin dmadmin  62787 Jun 16 08:18 D2_CORE_DAR.log
-rw-r-----. 1 dmadmin dmadmin   4794 Jun 16 08:20 D2-C2_dar.log
-rw-r-----. 1 dmadmin dmadmin   3105 Jun 16 08:22 D2-Bin_dar.log
-rw-r-----. 1 dmadmin dmadmin   2262 Jun 16 08:24 D2-O2_DAR.log
[dmadmin@cs_01 Docbase1]$

 

If you only have “D2_CORE_DAR.log”, then you are potentially facing this issue. You could also check the “csDir” folder that you put in the D2 silent parameter file: if this folder doesn’t contain “O2-API.jar” or “C2-API.jar” or “D2-Bin-API.jar”, then you have the issue as well. Obviously, you could also check the list of installed DARs in the repository…
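
The log file check described above can be scripted. This is a minimal sketch, assuming the four log file names shown in the listing; check_dar_logs is a hypothetical helper name and not something shipped with D2.

```shell
# Report which of the expected DAR log files are missing from a
# docbase config folder. $1 = folder to check.
# Returns 0 when all four logs are present, 1 otherwise.
check_dar_logs() {
    missing=0
    for f in D2_CORE_DAR.log D2-C2_dar.log D2-Bin_dar.log D2-O2_DAR.log; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example call on the Content Server:
#   check_dar_logs "$DOCUMENTUM/dba/config/Docbase1"
```

If only the three plugin logs are reported as missing, you are potentially facing the issue described in this blog.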

So what’s the issue? Well, you remember above when I mentioned the “-Djava.io.tmpdir” Java option to specifically ask D2 to put all temporary files under a certain location? The D2 Installer, for the D2 part, is using this option without issue… But for the D2+Pack installation, there is actually a hardcoded path for the temporary files which is /tmp. Therefore, it will ignore this Java option and will try instead to execute the installation under /tmp.

This is the issue I have faced a few times already, and it’s the one I wanted to talk about in this blog. For security reasons, you might have to deal from time to time with specific mount options on file systems. In this case, the “noexec” option was set on the /tmp mount point, so D2 wasn’t able to execute commands under /tmp and, instead of printing an error, it just silently skipped the installation. I had an SR opened with the Documentum Support (when it was still EMC) to see if it was possible to use the Java option and not /tmp, but it looks like it’s still not solved, since I had the exact same issue with D2 4.7 P25, which was released very recently.

Since there is apparently no way to specify which temporary folder should be used for the D2+Pack Plugins, you should either perform the installation manually (DAR installation + libraries in JMS & WAR files) or remove the “noexec” option on the file system for the time of the installation:

[dmadmin@cs_01 Docbase1]$ mount | grep " /tmp"
/dev/mapper/VolGroup00-LogVol06 on /tmp type ext4 (rw,noexec,nosuid,nodev)
[dmadmin@cs_01 Docbase1]$
[dmadmin@cs_01 Docbase1]$ sudo mount -o remount,exec /tmp
[dmadmin@cs_01 Docbase1]$ mount | grep " /tmp"
/dev/mapper/VolGroup00-LogVol06 on /tmp type ext4 (rw,nosuid,nodev)
[dmadmin@cs_01 Docbase1]$
[dmadmin@cs_01 Docbase1]$ #Execute the D2 Installer here
[dmadmin@cs_01 Docbase1]$
[dmadmin@cs_01 Docbase1]$ sudo mount -o remount /tmp
[dmadmin@cs_01 Docbase1]$ mount | grep " /tmp"
/dev/mapper/VolGroup00-LogVol06 on /tmp type ext4 (rw,noexec,nosuid,nodev)
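
Before launching the installer, a quick pre-check can tell whether /tmp is mounted with “noexec”. The following is a minimal sketch, assuming a Linux host where /proc/mounts is readable; has_noexec and mount_opts are hypothetical helper names. If /tmp is not a separate mount point, mount_opts prints nothing and the check simply reports no “noexec” option.

```shell
# Return 0 if the given comma-separated mount options contain "noexec".
has_noexec() {
    case ",$1," in
        *,noexec,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the mount options of a mount point as listed in /proc/mounts.
# $1 = mount point, e.g. /tmp
mount_opts() {
    awk -v mp="$1" '$2 == mp { opts = $4 } END { print opts }' /proc/mounts 2>/dev/null
}

if has_noexec "$(mount_opts /tmp)"; then
    echo "/tmp is mounted noexec: the D2+Pack Plugins installation would probably be skipped"
fi
```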

 

With the workaround in place, the D2 Installer should now print the following (same extract as above):

...
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Line read: Current MAC address : [ Starting to unpack ]
Line read: [ Processing package: core (1/2) ]
Line read: [ Processing package: DAR (2/2) ]
Line read: [ Unpacking finished ]
Line read: [ Writing the uninstaller data ... ]
Line read: [ Automated installation done ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Line read: Current MAC address : [ Starting to unpack ]
Line read: [ Processing package: core (1/2) ]
Line read: [ Processing package: DAR (2/2) ]
Line read: [ Unpacking finished ]
Line read: [ Writing the uninstaller data ... ]
Line read: [ Automated installation done ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Line read: Current MAC address : [ Starting to unpack ]
Line read: [ Processing package: core (1/2) ]
Line read: [ Processing package: DAR (2/2) ]
Line read: [ Unpacking finished ]
Line read: [ Writing the uninstaller data ... ]
Line read: [ Automated installation done ]
Installing plugin: $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/plugin/D2-Widget-Install.jar
...
...
Current line: #################################
Current line: #           Plugins               #
Current line: #################################
Current line: #plugin_1=../C2/C2-Plugin.jar
Updating line with 'plugin_'.
Updating plugin 1 with plugin name: D2-Widget-Plugin.jar and config exclude value of: false
Updating plugin 2 with plugin name: C2-Plugin.jar and config exclude value of: false
Updating plugin 3 with plugin name: O2-Plugin.jar and config exclude value of: false
Updating plugin 4 with plugin name: D2-Specifications-Plugin.jar and config exclude value of: false
Updating plugin 5 with plugin name: D2-Bin-Plugin.jar and config exclude value of: false
Current line: #plugin_2=../O2/O2-Plugin.jar
Current line: #plugin_3=../P2/P2-Plugin.jar
...

 

As you can see above, the output is quite different: it means that the D2+Pack Plugins have been installed.

 

The post Documentum – D2+Pack Plugins not installed correctly first appeared on Blog dbi services.

Schedule reboots of your AWS instances and how that can result in a hard reboot and corruption

Wed, 2019-07-17 02:50

From time to time you might need to reboot your AWS instances, maybe because you applied some patches or for whatever other reason. Rebooting an AWS instance can be done in several ways: you can of course do that directly from the AWS console. You can use the AWS command line utilities as well. If you want to schedule a reboot, you can either do that using CloudWatch or you can use SSM Maintenance Windows. In this post we will only look at CloudWatch and Systems Manager, as these two can be used to schedule the reboot easily using AWS native utilities. You could, of course, do that as well by using cron and the AWS command line utilities, but this is outside the scope of this post.

For CloudWatch the procedure for rebooting instances is the following: Create a new rule:

Go for “Schedule” and give a cron expression. In this case it means: 16-July-2019 at 07:45. Select the “EC2 RebootInstances API call” and provide the instance IDs you want to have rebooted. There is one limitation: you can only add up to five targets. If you need more, you have to use Systems Manager as described later in this post. You should pre-create an IAM role with sufficient permissions to use for this, as otherwise a new one will be created each time.
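
For reference, a CloudWatch Events cron expression has six fields (minutes, hours, day-of-month, month, day-of-week, year) and is evaluated in UTC. The sketch below builds the expression for the date used here; one_shot_cron is a hypothetical helper name, and the rule name in the commented AWS CLI call is made up for the illustration.

```shell
# Build a CloudWatch Events schedule expression for one point in time (UTC).
# Arguments: minute hour day-of-month month year
# The day-of-week field is set to "?" because day-of-month is specified.
one_shot_cron() {
    printf 'cron(%d %d %d %d ? %d)\n' "$1" "$2" "$3" "$4" "$5"
}

one_shot_cron 45 7 16 7 2019
# prints: cron(45 7 16 7 ? 2019)

# The same rule could be created from the command line, for example:
#   aws events put-rule --name reboot-once \
#       --schedule-expression "$(one_shot_cron 45 7 16 7 2019)"
```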

Finally give a name and a description, that’s it:


Once the time matches your cron expression, the instance(s) will reboot.

The other solution for scheduling stuff against many instances is to use AWS SSM. It requires a bit more preparation work but in the end this is the solution we decided to go for as more instances can be scheduled with one maintenance window (up to 50) and you could combine several tasks, e.g. executing something before doing the reboot and do something else after the reboot.

The first step is to create a new maintenance window:

Of course it needs a name and an optional description:

Again, in this example, we use a cron expression for the scheduling (same as above in the CloudWatch example). Be aware that this is UTC time:

Once the maintenance window is created we need to attach a task to it. Until now we only specified a time to run something but we did not specify what to run. Attaching a task can be done in the task section of the maintenance window:

In this case we go for an “Automation task”. Name and description are not required:

The important part is the document to run, in our case it is “AWS-RestartEC2Instance”:

Choose the instances you want to run the document against:

And finally specify the concurrency and error count and again, an IAM role with sufficient permissions to perform the actions defined in the document:

Last, but not least, specify a pseudo parameter called “{TARGET_ID}” which will tell AWS SSM to run that against all the instances you selected in the upper part of the screen:

That’s it. Your instances will be rebooted at the time you specified in the cron expression. All fine and easy, and you never have to worry about scheduled instance reboots. Just adjust the cron expression and maybe the list of instances and you are done for the next scheduled reboot.

Really? We did it like that against 100 instances and got a real surprise. What happened? Not many, but a few instances were rebooted hard, and one of them even needed to be restored afterwards. Why? This never happened in the tests we did before. When an instance does not reboot within 4 minutes, AWS performs a hard reboot. This can lead to corruption, as stated in the AWS documentation. When your instances are busy at the time of the reboot this is not what you want. On Windows you get something like this:

You can easily reproduce that by putting a Windows system under heavy load with a CPU stress test and then scheduling a reboot as described above.

In the background the automation document calls aws:changeInstanceState and that comes with a force parameter:

… and here we have it again: Risk of corruption. When you take a closer look at the automation document that stops an EC2 instance you can see that as well:

So what is the conclusion of all this? It is not to blame AWS for anything: everything is documented and works as documented. But testing in a test environment does not necessarily mean it works in production as well. Even if a behavior is documented you might not expect it, because your tests went fine and you missed the part of the documentation where it is explained. AWS Systems Manager is still a great tool for automating tasks, but you really need to understand what happens before implementing it in production. And finally: working on public clouds makes many things easier, but others harder to understand and troubleshoot.

The article Schedule reboots of your AWS instances and how that can result in a hard reboot and corruption first appeared on Blog dbi services.

Email Spoofing

Mon, 2019-07-15 11:07

Have you ever had the unpleasant feeling of being accused of something you had nothing to do with? Of feeling helpless in the face of an accusing mail whose imperative tone is designed to make you feel ashamed?

That is exactly the aim of a particular kind of sextortion mail that uses spoofing to try to extort money from you: a message from a supposed “hacker” who claims to have hacked into your computer. He threatens to publish compromising images taken with your webcam without your knowledge, and demands a ransom, most of the time in a virtual currency.

Something like this:

 

Date:  Friday, 24 May 2019 at 09:19 UTC+1
Subject: oneperson
Your account is hacked! Renew the pswd immediately!
You do not heard about me and you are definitely wondering why you’re receiving this particular electronic message, proper?
I’m ahacker who exploitedyour emailand digital devicesnot so long ago.
Do not waste your time and make an attempt to communicate with me or find me, it’s not possible, because I directed you a letter from YOUR own account that I’ve hacked.
I have started malware to the adult vids (porn) site and suppose that you watched this website to enjoy it (you understand what I mean).
Whilst you have been keeping an eye on films, your browser started out functioning like a RDP (Remote Control) that have a keylogger that gave me authority to access your desktop and camera.
Then, my softaquiredall data.
You have entered passcodes on the online resources you visited, I intercepted all of them.
Of course, you could possibly modify them, or perhaps already modified them.
But it really doesn’t matter, my app updates needed data regularly.
And what did I do?
I generated a reserve copy of every your system. Of all files and personal contacts.
I have managed to create dual-screen record. The 1 screen displays the clip that you were watching (you have a good taste, ha-ha…), and the second part reveals the recording from your own webcam.
What exactly must you do?
So, in my view, 1000 USD will be a reasonable amount of money for this little riddle. You will make the payment by bitcoins (if you don’t understand this, search “how to purchase bitcoin” in Google).
My bitcoin wallet address:
1816WoXDtSmAM9a4e3HhebDXP7DLkuaYAd
(It is cAsE sensitive, so copy and paste it).
Warning:
You will have 2 days to perform the payment. (I built in an exclusive pixel in this message, and at this time I understand that you’ve read through this email).
To monitorthe reading of a letterand the actionsin it, I utilizea Facebook pixel. Thanks to them. (Everything thatis usedfor the authorities may helpus.)

In the event I do not get bitcoins, I shall undoubtedly give your video to each of your contacts, along with family members, colleagues, etc?

 

Victims of these scams receive a message from a stranger who presents himself as a hacker. This alleged “hacker” claims to have taken control of the victim’s computer after the victim visited a pornographic site (or any other site that morality would condemn). The cybercriminal then announces that he has compromising videos of the victim, recorded with the victim’s own webcam. He threatens to send them to the victim’s personal or even professional contacts if the victim does not pay a ransom. This ransom, which ranges from a few hundred to several thousand dollars, is demanded in a virtual currency (usually, but not always, Bitcoin).

To scare the victim even more, cybercriminals sometimes go so far as to write to the victim from the victim’s own email address, in order to make him or her believe that they have actually taken control of the account.

First of all, there is no need to be afraid. While the “hack” announced by the cybercriminals is not impossible in theory, in practice it remains technically complex and, above all, time-consuming to carry out. Since scammers target their victims by the thousands, it can be deduced that they would not have the time to do what they claim to have done.

These messages are just an attempt at a scam. In other words, if you receive such a blackmail message and do not pay, nothing more will happen.

Second, there is no need to change your email credentials. Your email address is usually widely known and already circulates on the Internet, because you use it regularly on different sites to identify yourself and to communicate. Some of these sites have resold or exchanged their address files, for marketing purposes, with partners that are more or less scrupulous.

Finally, if the cybercriminals have written to you from your own email address to make you believe that they have taken control of it: be aware that the sender’s address in a message is just a displayed value that can very easily be forged without much technical skill.
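To illustrate that the visible sender is only a header field, here is a small sketch that flags messages whose From address equals the recipient's address while the receiving server recorded no successful DMARC check. It is a heuristic for illustration, not a complete filter. The sample message, the host names, and the function name `looks_self_spoofed` are made up, and the sketch assumes the receiving mail server adds an Authentication-Results header.

```python
from email import message_from_string
from email.utils import parseaddr


def looks_self_spoofed(raw_message: str) -> bool:
    """Return True if From equals To and no 'dmarc=pass' result was recorded."""
    msg = message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    recipient = parseaddr(msg.get("To", ""))[1].lower()
    # Results written by the receiving server, if it performs such checks.
    results = " ".join(msg.get_all("Authentication-Results", [])).lower()
    authenticated = "dmarc=pass" in results
    return bool(sender) and sender == recipient and not authenticated


# Made-up sample: the From header claims to be the recipient herself.
SAMPLE = (
    "From: alice@example.com\n"
    "To: alice@example.com\n"
    "Authentication-Results: mx.example.com; spf=fail; dmarc=fail\n"
    "Subject: Your account is hacked!\n"
    "\n"
    "I directed you a letter from YOUR own account.\n"
)

print(looks_self_spoofed(SAMPLE))
```

The From header in the sample is accepted by the parser like any other text, which is the whole point: nothing in the message format itself proves who wrote it.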

In any case, the way to go is simple: don’t panic, don’t answer, don’t pay, just throw this mail in the trash (and don’t forget to empty it regularly). 

On the mail server side, setting up certain elements can help to prevent this kind of mail from spreading in the organization. This involves deploying the following measures on your mail server:

  • SPF (Sender Policy Framework): This is a standard for verifying the domain name of the sender of an email (standardized in RFC 7208 [1]). The adoption of this standard is likely to reduce spam. SMTP (Simple Mail Transfer Protocol) itself does not provide a sender verification mechanism; SPF aims to reduce the possibility of spoofing by publishing a record in the DNS (Domain Name System) indicating which IP addresses are allowed or forbidden to send mail for the domain in question.
  • DKIM (DomainKeys Identified Mail): This is a reliable authentication standard for the domain name of the sender of an email that provides effective protection against spam and phishing (standardized in RFC 6376 [2]). DKIM works with a cryptographic signature: it verifies the authenticity of the sending domain and also guarantees the integrity of the message.
  • DMARC (Domain-based Message Authentication, Reporting and Conformance): This is a technical specification to help reduce email misuse by providing a solution for deploying and monitoring email authentication (specified in RFC 7489 [3]). DMARC standardizes how recipients perform email authentication using the SPF and DKIM mechanisms.
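The SPF idea from the list above can be sketched in a few lines. This is a deliberately simplified illustration, not a conforming SPF implementation: it only understands the `ip4`, `ip6` and `all` mechanisms and ignores `include`, `a`, `mx`, `redirect` and everything else. The record and the IP addresses are documentation examples, and the function name `check_spf` is hypothetical.

```python
import ipaddress

# Qualifier prefix of a mechanism and the result it produces on a match.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf(record: str, sender_ip: str) -> str:
    """Evaluate a simplified SPF record against the connecting IP address."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"  # default qualifier when none is given
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name == "all":
            return QUALIFIERS[qualifier]
        if name in ("ip4", "ip6"):
            if ip in ipaddress.ip_network(value, strict=False):
                return QUALIFIERS[qualifier]
        # Every other mechanism is skipped in this sketch.
    return "neutral"  # nothing matched


# Example record as it might be published in a DNS TXT entry for a domain.
RECORD = "v=spf1 ip4:192.0.2.0/24 -all"

print(check_spf(RECORD, "192.0.2.10"))    # address inside the allowed range
print(check_spf(RECORD, "198.51.100.7"))  # address outside the allowed range
```

A receiving server that obtains "fail" for the second address knows the mail did not come from a host the domain owner authorized, regardless of what the From header claims.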

 

REFERENCES

[1] S. Kitterman, “Sender Policy Framework (SPF)”, ser. RFC7208, 2014, https://tools.ietf.org/html/rfc7208

[2] D. Crocker, T. Hansen, M. Kucherawy, “DomainKeys Identified Mail (DKIM) Signatures”, ser. RFC6376, 2011, https://tools.ietf.org/html/rfc6376

[3] M. Kucherawy, E. Zwicky, “Domain-based Message Authentication, Reporting and Conformance (DMARC)”, ser. RFC7489, 2015, https://tools.ietf.org/html/rfc7489

The article Email Spoofing first appeared on Blog dbi services.

Migrating your users from md5 to scram authentication in PostgreSQL

Thu, 2019-07-11 03:43

One of the new features in PostgreSQL 10 was the introduction of stronger password authentication based on SCRAM-SHA-256. How can you migrate your existing users that currently use md5 authentication to the new method without any interruption? Actually that is quite easy, as you will see in a few moments, but there is one important point to consider: not every client/driver supports SCRAM-SHA-256 authentication yet, so you need to check that first. Here is the list of the drivers and their support for SCRAM-SHA-256.

The default method that PostgreSQL uses to hash passwords is defined by the “password_encryption” parameter:

postgres=# show password_encryption;
 password_encryption 
---------------------
 md5
(1 row)

Let’s assume we have a user that was created like this in the past:

postgres=# create user u1 login password 'u1';
CREATE ROLE

With the default method of md5 the hashed password looks like this:

postgres=# select passwd from pg_shadow where usename = 'u1';
               passwd                
-------------------------------------
 md58026a39c502750413402a90d9d8bae3c
(1 row)
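As a side note, the md5 format stored in pg_shadow is the literal prefix md5 followed by the MD5 digest of the password concatenated with the user name. The sketch below reproduces that construction; the function name `pg_md5_hash` is hypothetical. Running it with the user and password from the example lets you compare the result with the value shown above.

```python
import hashlib


def pg_md5_hash(password: str, username: str) -> str:
    """Build a PostgreSQL-style md5 password hash: 'md5' + md5(password + username)."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


# User name and password taken from the CREATE USER statement above.
print(pg_md5_hash("u1", "u1"))
```

Because the user name acts as the only salt, two users with the same name and password on different servers end up with the same hash, which is one of the weaknesses SCRAM-SHA-256 addresses.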

As you can see, the hash starts with md5, so we know that this hash was generated by the md5 algorithm. When we want this user to use scram-sha-256 instead, what do we need to do? The first step is to change the “password_encryption” parameter:

postgres=# alter system set password_encryption = 'scram-sha-256';
ALTER SYSTEM
postgres=# select pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

postgres=# select current_setting('password_encryption');
 current_setting 
-----------------
 scram-sha-256
(1 row)

From now on the server hashes new passwords with scram-sha-256 instead of md5. But what happens when our user wants to connect to the instance after this change? The authentication method for the connection is currently defined in pg_hba.conf:

postgres=> \! grep u1 $PGDATA/pg_hba.conf
host    postgres        u1              192.168.22.1/24         md5

Even though the default is no longer md5, the user can still connect to the instance because the stored password hash for that user did not change:

postgres=> \! grep u1 $PGDATA/pg_hba.conf
host    postgres        u1              192.168.22.1/24         md5

postgres@rhel8pg:/home/postgres/ [PGDEV] psql -h 192.168.22.100 -p 5433 -U u1 postgres
Password for user u1: 
psql (13devel)
Type "help" for help.

postgres=> 

Once the user changed the password:

postgres@rhel8pg:/home/postgres/ [PGDEV] psql -h 192.168.22.100 -p 5433 -U u1 postgres
Password for user u1: 
psql (13devel)
Type "help" for help.

postgres=> \password
Enter new password: 
Enter it again: 
postgres=> 

… the hash of the new password is not md5 but SCRAM-SHA-256:

postgres=# select passwd from pg_shadow where usename = 'u1';
                                                                passwd                               >
----------------------------------------------------------------------------------------------------->
 SCRAM-SHA-256$4096:CypPmOW5/uIu4NvGJa+FNA==$PNGhlmRinbEKaFoPzi7T0hWk0emk18Ip9tv6mYIguAQ=:J9vr5CQDuKE>
(1 row)
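The stored value has the layout SCRAM-SHA-256$iterations:salt$StoredKey:ServerKey, with salt and keys base64-encoded. The following sketch derives such a value with the key derivation steps of SCRAM (PBKDF2, then HMAC with the constants "Client Key" and "Server Key"). It is an illustration under assumptions: the function name is hypothetical, the password and salt are made up, and the sketch skips the SASLprep normalization step, which makes no difference for plain ASCII passwords.

```python
import base64
import hashlib
import hmac


def scram_sha256_verifier(password: str, salt: bytes, iterations: int = 4096) -> str:
    """Derive a SCRAM-SHA-256 verifier string in the layout shown above."""
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()  # only a hash of the client key is stored
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()

    def b64(raw: bytes) -> str:
        return base64.b64encode(raw).decode("ascii")

    return f"SCRAM-SHA-256${iterations}:{b64(salt)}${b64(stored_key)}:{b64(server_key)}"


# Made-up password and salt; a server would pick a random salt per password.
print(scram_sha256_verifier("secret", b"0123456789abcdef"))
```

Unlike the md5 format, the salt is random per password and the iteration count makes brute-forcing a stolen hash considerably more expensive.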

One could expect that from now on the user can no longer connect, as we have not changed pg_hba.conf yet:

postgres@rhel8pg:/home/postgres/ [PGDEV] psql -h 192.168.22.100 -p 5433 -U u1 postgres
Password for user u1: 
psql (13devel)
Type "help" for help.

postgres=> 

But in reality that still works: when pg_hba.conf specifies md5 but the stored password is a SCRAM-SHA-256 hash, the server automatically uses SCRAM-SHA-256 authentication. So once all users have changed their passwords you can safely switch the rule in pg_hba.conf and you’re done:

postgres=> \! grep u1 $PGDATA/pg_hba.conf
host    postgres        u1              192.168.22.1/24         scram-sha-256

postgres=# select pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

You just need to make sure that no user still has a hash starting with md5 and that every hash now starts with SCRAM-SHA-256.
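One way to check this is to classify the hashes by their prefix. The sketch below works on rows of (usename, passwd) as they could be fetched from pg_shadow; it does not connect to a database. The function names and the sample rows are hypothetical, and the query string is only a suggestion for doing the same check directly in SQL.

```python
# Suggested query for the same check in psql (not executed by this sketch).
REMAINING_MD5_QUERY = "select usename from pg_shadow where passwd like 'md5%'"


def hash_method(passwd):
    """Classify a pg_shadow.passwd value by its prefix."""
    if passwd is None:
        return "none"  # role without a password
    if passwd.startswith("SCRAM-SHA-256$"):
        return "scram-sha-256"
    if passwd.startswith("md5") and len(passwd) == 35:
        return "md5"
    return "unknown"


def users_still_on_md5(rows):
    """Return the names of all users whose stored hash is still md5."""
    return [usename for usename, passwd in rows if hash_method(passwd) == "md5"]


# Sample rows: u1 with the md5 hash shown earlier, u2 is a made-up migrated user.
rows = [
    ("u1", "md58026a39c502750413402a90d9d8bae3c"),
    ("u2", "SCRAM-SHA-256$4096:c2FsdA==$c3RvcmVk:c2VydmVy"),
    ("u3", None),
]
print(users_still_on_md5(rows))
```

Only when this list is empty is it safe to switch the remaining md5 rules in pg_hba.conf to scram-sha-256.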

The article Migrating your users from md5 to scram authentication in PostgreSQL first appeared on Blog dbi services.
