Wednesday 29 November 2017

OpenStack shrink image virtual disk size

When OpenStack creates a snapshot, the image is stored in qcow2 format by Glance, in /var/lib/glance/images/.

However, sometimes the virtual disk size of the image exceeds the disk size of a flavor, even the same flavor the original VM instance used. This is because the snapshot includes more than just the main partition (/dev/sda1, /dev/sda2, ...), so the total disk size exceeds the disk size of the original flavor.

An administrator can change the size of these disk images using the guestfish and virt-resize tools from the libguestfs library. Expanding a disk is easy, but shrinking is a bit more involved. Follow this guide to shrink the virtual size of OpenStack image files, or of any image format supported by libguestfs.


1. Check the image details

$ qemu-img info ./disk.img
file format: qcow2
virtual size: 11G (11811160064 bytes)
disk size: 2.0G
cluster_size: 65536

=> This image requires a flavor with an 11 GB disk, while the actual file size is only 2.0G.
We will change the virtual size from 11G to 5G so that the image can be deployed on a smaller flavor.
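If you need to script this check across many images, the virtual size can be pulled out of the qemu-img output. A minimal sketch (the helper name and regex are my own, matched against the sample output above):

```python
import re

def parse_virtual_size(text):
    """Extract the virtual size in bytes from `qemu-img info` output.

    Relies on the "virtual size: 11G (11811160064 bytes)" line format
    shown in the sample above; returns None if the line is absent.
    """
    m = re.search(r"virtual size:.*\((\d+) bytes\)", text)
    return int(m.group(1)) if m else None

sample = """file format: qcow2
virtual size: 11G (11811160064 bytes)
disk size: 2.0G
cluster_size: 65536"""

print(parse_virtual_size(sample))  # 11811160064
```

Comparing this number against a flavor's disk size tells you which images need shrinking.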

2. Install guestfs-tools and lvm2

$ yum install lvm2 libguestfs-tools

If you're running this on a dedicated Glance server without libvirt, set the following environment variable to bypass the libvirt back end:

export LIBGUESTFS_BACKEND=direct

3. Check the detailed partition info of the image.

$ virt-filesystems --long --parts --blkdevs -h -a ./disk.img
Name       Type       MBR  Size  Parent
/dev/sda1  partition  83   10G   /dev/sda
/dev/sda2  partition  83   512M  /dev/sda
/dev/sda   device     -    11G   -

$ virt-df ./disk.img
Filesystem            1K-blocks       Used  Available  Use%
disk.img:/dev/sda1      5016960    1616912    3383664   33%

=> /dev/sda1 is the main partition, and only about 1.6 GB is used. This partition will be resized to 4G, which makes the total size of the image 5G.

Note that the main partition must be shrunk to less than the intended size of the image, even if it is the only partition. For example, to make a 5G image, /dev/sda1 must be resized below 5G, e.g. to 4G or 3.5G.
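The rule above is simple arithmetic: the shrunk main partition plus every other partition must fit inside the target image. A tiny sketch of the check (helper name is mine):

```python
def can_fit(partition_sizes_gb, target_image_gb):
    """True if all partitions fit strictly inside the target image,
    leaving a little slack for the partition table and alignment."""
    return sum(partition_sizes_gb) < target_image_gb

# Shrinking /dev/sda1 to 4G alongside the 512M /dev/sda2 fits in a 5G image:
print(can_fit([4, 0.5], 5))   # True
# Leaving /dev/sda1 at 5G would not:
print(can_fit([5, 0.5], 5))   # False
```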

4. Use guestfish for resizing

It is a good idea to make a backup before proceeding, because the following commands can corrupt the image's partitions.

$ guestfish -a ./disk.img
><fs> run
><fs> list-filesystems
...
><fs> e2fsck-f /dev/sda1
><fs> resize2fs-size /dev/sda1 4G
><fs> exit

=> This shrinks only the filesystem inside /dev/sda1; the partition table and the image's virtual size remain unchanged until the next step.
=> Again, the size of /dev/sda1 must be smaller than the intended size of the entire image. We set 4G here, so the final image can be 5G.

5. Use virt-resize for actual resizing.

First, create an empty image to receive the shrunk content. Specify the size of the entire image here.

$ qemu-img create -f qcow2 -o preallocation=metadata newdisk.qcow2 5G

Once the new image file is created, use the virt-resize command to copy the data from the old file into the new one.

$ virt-resize --shrink /dev/sda1 ./disk.img ./newdisk.qcow2

Now you have a new image file with the same content as the old one. /dev/sda1 is first shrunk and then extended to fill the free space in newdisk.qcow2. In this example, /dev/sda1 ends up at 4.5G, because it occupies everything in the 5G image except the 512M /dev/sda2.
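The resulting partition size is predictable: virt-resize gives the main partition whatever the other partitions do not use. A small sketch of the arithmetic (names are mine):

```python
GB = 1024 ** 3
MB = 1024 ** 2

def final_main_partition_gb(image_gb, other_partitions_bytes):
    """Size the main partition ends up with after virt-resize expands
    it to fill the new image, ignoring the small partition-table overhead."""
    return (image_gb * GB - other_partitions_bytes) / GB

# 5G image minus the 512M /dev/sda2 leaves ~4.5G for /dev/sda1:
print(final_main_partition_gb(5, 512 * MB))  # 4.5
```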


6. Check the size and partition information of the new image.


$ qemu-img info ./newdisk.qcow2
file format: qcow2
virtual size: 5.0G
disk size: 1.9G
cluster_size: 65536

$ virt-filesystems --long --parts --blkdevs -h -a ./newdisk.qcow2
Name       Type       MBR  Size  Parent
/dev/sda1  partition  83   4.5G  /dev/sda
/dev/sda2  partition  83   512M  /dev/sda
/dev/sda   device     -    5.0G  -

$ virt-df ./newdisk.qcow2
Filesystem              1K-blocks       Used  Available  Use%
newdisk.qcow2:/dev/sda1   5522976    1616912    3889680   30%

7. Shrink the image file size

If the image file itself is too big, you can reduce its size using the qemu-img utility:

$ qemu-img convert -O qcow2 ./newdisk.qcow2 ./newdisk2.qcow2


8. Upload the new image to OpenStack

The new image file will not be recognised by OpenStack if you simply swap the image file on disk. You will get an error message "Not authorized for image ....." when you create a VM from the image.

Instead of swapping the file, use the OpenStack command line to register the new image.
$ openstack image create --disk-format qcow2 --container-format bare --private --file ./newdisk.qcow2 My_Image_Shrunk

=> This command adds the new image file as the "My_Image_Shrunk" image, which can be used to create new instances.


Saturday 11 November 2017

Understanding OpenStack networking architecture concepts: an easy description.

OpenStack is a popular open-source cloud management platform adopted by many enterprises to deploy small to medium sized clouds. When building a cloud data center, the system administrator has to consider how to build the network.

For a tiny private cloud, it is fine to create a single network for all traffic regardless of its characteristics. A larger cloud serving multiple tenants (e.g., in a university or a small company) needs to consider more aspects, security above all. The system administrator does not want tenants to reach the physical infrastructure through a public intranet; the physical infrastructure has to be hidden for security reasons. Network performance and availability also have to be considered in the design. For these reasons, a larger cloud data center will adopt multiple networks for different purposes, such as a management network, data network, tenant network, etc.


< Terms and Definitions >


Many documents on the Internet suggest and explain network architectures for different purposes, but they are somewhat confusing and easy to misunderstand. In this article, let me clarify the terms for the different networks used in documents about OpenStack, and about data center networks in general.

1. Management network vs tenant network: used by whom?

Let's think about 'who' uses the network. Is it the system administrator of OpenStack, or the tenants who use the VMs? If the network is used by the system administrator who manages the data center, it is called the 'management network'. A common use of the management network is to access each physical node to configure and maintain OpenStack components such as Nova or Neutron.

On the other hand, the 'tenant network' is used by the cloud tenants. The main traffic on the tenant network flows between VMs, and from the outside world to the VMs used by end-users and the tenant. Note that this is independent of 'internal' vs 'external'; we will discuss the difference between the tenant network and the internal network later.

2. Control network vs data network: used for what?

These terms look very similar to 'management' and 'tenant', but they focus on the 'purpose' of the network rather than its users. The control network carries control packets, while the data network carries data. Think about image data traveling from the Glance node to a compute node, or a VM migration between two compute nodes. These are management tasks, but the traffic is data traffic rather than control traffic.

Control traffic is a command like 'send ImageA from the Glance node to the Compute1 node', sent by the controller to the Glance and Compute1 nodes, while the actual image transfer from Glance to Compute1 is data traffic. Although both originate from the cloud manager for management purposes, the characteristics of the two kinds of traffic are different.

3. Internal vs external: IP address range or VM network?

These are general networking terms, which is why they are easy to confuse. In general networking, an internal network uses a private IP address range (192.168.x.x, 10.x.x.x, or 172.16.x.x-172.31.x.x), whereas an external network uses a public IP address range and is open to the public.

However, in cloud computing they can have a slightly different meaning. An internal network is a network for VMs that can be reached only from inside the VM network, while an external network is exposed to the outside of the VM network. This is especially confusing for a data center built on an intranet: the intranet itself already uses private IP addresses, yet it is the 'external' network for the VMs in the data center.

For example, a company uses the 192.168.x.x range for its intranet; all desktops and laptops use these addresses. An employee of this company (a tenant) creates a VM in the internal cloud. This VM should be reachable through the intranet IP range (192.168.x.x). Although this is the internal network for the company, it is at the same time the external network for the VMs.
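The 'private address' side of this can be checked with Python's standard ipaddress module. Note that is_private only knows about the reserved ranges (RFC 1918 and friends); it says nothing about a cloud's internal/external roles, which is exactly the distinction made above:

```python
import ipaddress

# 172.x.x.x is only private within 172.16.0.0/12 (172.16-172.31),
# which is why writing "172.x.x.x" for the private range is misleading.
for addr in ["192.168.1.10", "172.16.0.5", "172.32.0.1", "8.8.8.8"]:
    ip = ipaddress.ip_address(addr)
    print(addr, ip.is_private)
# 192.168.1.10 True
# 172.16.0.5 True
# 172.32.0.1 False
# 8.8.8.8 False
```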

4. Physical vs virtual network.

The physical network is for physical nodes: communication between the controller and the other nodes. The virtual network is for VMs. These are high-level terms; in reality every virtual network runs on top of the underlying physical network. We separate them only to distinguish the network layers.

5. Physical layer vs network layer.

The most common source of confusion in networking is the separation of layers. So far we have talked about networks at the "network layer", a.k.a. the "IP layer", not the physical or link layer. Note that all the aforementioned networks can be served over a single physical medium by separating IP ranges with proper routing settings on the hosts.


6. Other terms: API, provider, guest, storage, logical, flat, etc, etc, etc network...

Depending on the article, these terms are used in many different ways. Let's skip per-term explanations and instead walk through an example network architecture.


< Example network architecture >

Let's build a data center with separate networks. This data center is connected directly to both the Internet and an intranet, which means VMs can be reached from both.

* Internet IP range: 123.4.x.x
* Intranet IP range: 172.16.x.x

These two networks are called the "public", "provider", or "external" network, meaning they are exposed to the outside of the data center. A VM can acquire either or both of these IP addresses, letting everyone in the world (via the Internet) or within the company (via the intranet) connect to it.

* Provider physical network: External IP range only for VM (none for physical nodes)

This is not an IP network but an L2 network connecting the VMs' L3 networking to the external networks. VMs acquire an Internet or intranet IP address from a network node or an external DHCP server and use this physical network to reach the Internet or intranet.
Note that the physical interfaces on the compute nodes get no IP address on this network. They only provide L2 service connecting the hosted VMs to the external network. Thus tenants cannot reach the physical nodes through this network, even though it is directly exposed to the outside. In Neutron, this is configured as the 'flat' type.

* Management physical network (Control network): 192.168.100.x

Within the data center, all physical nodes (controller, network, compute, Glance, etc.) are connected to this network for system management. Only the data center administrator can access this network.

* Virtual network using VxLAN: any IP range
This is a self-service network created and managed by a tenant. It is a virtual network using a tunneling protocol such as VxLAN, so the tenant can assign any IP range. Note that this network is reachable only from VMs in the same virtual network. The physical network carrying this traffic can either be combined with the physical management network or separated out as a tenant data network.
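One practical consequence of tunneling: VxLAN encapsulation costs about 50 bytes per packet over IPv4 (outer IP 20 + UDP 8 + VxLAN header 8, plus the inner Ethernet header of 14 bytes carried as payload), so tenant VMs usually need a reduced MTU. A small sketch of the arithmetic (helper name is mine):

```python
# VxLAN-over-IPv4 overhead: outer IP (20) + UDP (8) + VxLAN header (8)
# + inner Ethernet header (14) = 50 bytes.
VXLAN_OVERHEAD = 20 + 8 + 8 + 14

def inner_mtu(physical_mtu=1500):
    """Largest MTU tenant VMs can use so encapsulated frames still
    fit on the physical tenant data network."""
    return physical_mtu - VXLAN_OVERHEAD

print(inner_mtu())      # 1450
print(inner_mtu(9000))  # 8950, with jumbo frames on the physical side
```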

* Tenant data network (optional): 192.168.200.x

For virtual network traffic between VMs, we can set up a separate physical network dedicated to tenant data, distinct from the management network. If it is not physically separated, the virtual network traffic between VMs shares the management network.

* Management data network (optional)

If you want to separate management-purpose data traffic from control command traffic, you can also use a different physical network for it. In most cases, including this example, this data traffic shares the physical management network (192.168.100.x). Sometimes it is configured to use the tenant data network (192.168.200.x) for specific reasons, such as VM migration or image transfer separation.

* Storage data network (optional)

For further separation, a storage data network can be physically separated to provide stable, high-performance object or volume storage. If it is not separated, storage traffic can use any of the data networks or the physical management network.

In this example, we only consider the tenant data network; the other optional networks are omitted.


< Physical interfaces for the example >

* Router/DHCP server: provides connectivity between Internet, intranet and the data center network. 

-eth0 - to outside Internet (public)
-eth1 - to outside intranet (public)
-eth2/3 - to Provider physical network for Internet (123.4.0.1) and intranet (172.16.0.1): GW for VMs
-eth4 - to Control network (192.168.100.1): provides NAT for the control network and public access to the tenants' UI on the controller node.

In this configuration, eth2 and eth3 are connected to the same physical network. The router needs more sophisticated firewall and routing rules for better security. It basically connects four different physical networks in this scenario: two public networks, the provider network, and the private control network. Forwarding rules should allow incoming traffic from outside (eth0/1) to reach only the VMs on eth2/3 or the controller node on eth4, while outgoing traffic from eth2/3/4 can freely access the Internet.

Note that eth2/3 can be detached from the router if you want Neutron to control them. In that case, the external networks (123.4.x.x/172.16.x.x) are connected only to the network node running the Neutron L3 agent, and Neutron provides DHCP and routing for the VMs.

* Controller node

-eth0 - Control network (192.168.100.x)
-eth1 - Internet/intranet through router to provide UIs.

Receives VM creation requests and other requests from tenants through Horizon, and sends commands to the other physical nodes by calling their APIs over the eth0 control network.


* Network node

-eth0 - Control network (192.168.100.x)
-eth1 - Tenant data network (192.168.200.x): manages virtual tunnel networks
-eth2 - Tenant provider network (No IP): manages VMs' external ('public') access

The network node is in charge of managing virtual networks, e.g. providing DHCP for VxLANs, L3 routing between different VxLANs, and controlling 'public' access to VMs.


* Compute nodes

-eth0 - Control network (192.168.100.x)
-eth1 - Tenant data network (192.168.200.x): VM-to-VM intra-DCN traffic, e.g. VxLAN
-eth2 - Tenant provider network (No IP): VM's external connectivity to Internet or intranet

The Nova API listens on eth0 for commands from the controller. VMs hosted on the compute node use eth1 for intra-DCN traffic to other VMs, addressed with the 'virtual network' IP range. These packets are encapsulated in VxLAN or another tunneling protocol and sent to the other compute node over the tenant data network. Note that 192.168.200.x is assigned only to physical compute nodes, so that one compute node can find the node hosting the destination VM. Thus 192.168.200.x is not reachable by VMs; VMs can reach only other VMs, because every packet a VM sends toward eth1 is encapsulated in the tunneling protocol.

On the other hand, eth2 is connected to the VMs directly, without any tunneling protocol, which makes it a so-called 'flat' network. The provider network reaches the VMs through this interface. When a VM needs Internet access, Neutron creates a virtual interface in the VM, which can acquire a public IP address from the router or from Neutron. For a floating IP, the virtual interface is instead created within the Neutron controller, which forwards the traffic to the assigned VM through this network.

< Conclusion >

There are many terms that may confuse people. In this article I have tried to explain them as simply as possible, but they remain a bit confusing. The most important, and most easily confused, point is mixing up networks at different layers. Whenever you think about network architecture, separate the concepts by network layer. Do not mix the L3 (IP) layer with the underlying L2 or physical layer. On a single physical layer, multiple higher-layer networks can operate. Think about your desktop computer: it has one IP address (L3) and one MAC address (L2), but it runs many applications using TCP/UDP port numbers. For the same reason, multiple IP networks can exist on the same Ethernet, and you can assign many IP addresses to one NIC on Windows or Linux. If you separate the layers and think logically about the distinctions discussed above, it becomes much easier to follow what others are talking about.

For more information, see this OpenStack document. It is a bit outdated, but provides much more detail than the recent documentation:
https://docs.openstack.org/liberty/networking-guide/deploy.html

Thursday 26 October 2017

OpenStack monitoring - using Ceilometer and Gnocchi

OpenStack has its own monitoring tool: Ceilometer. It has to be installed separately because it is not shipped with OpenStack by default.

In recent versions of OpenStack, Ceilometer has been split into two projects, Ceilometer and Gnocchi. Ceilometer is in charge of polling the monitored metrics, while Gnocchi collects the data and delivers it to the user. OpenStack names this the "Telemetry Service", which is in fact a combination of Ceilometer, Gnocchi, and other software modules.

Because of this complex history, some articles and answers on the Internet about how to use Ceilometer are outdated and do not apply to the current version.
In this article, we will look at how to install Ceilometer and Gnocchi on OpenStack Pike (the most recent version), with some examples.

1. Install Gnocchi and Ceilometer on Controller node
Gnocchi is in charge of collecting and storing the monitored data, and of providing the data to the user. Simply speaking, its role is the same as a database system's: it stores and retrieves data. In fact, Gnocchi uses a database system to store the data and/or the index of the data.

Follow these instructions, with a caveat about Gnocchi:
https://docs.openstack.org/ceilometer/pike/install/install-controller.html#

As the document is outdated, it does not cover the installation of Gnocchi. If you encounter any problem caused by Gnocchi, install it separately using its own documentation:
http://gnocchi.xyz/install.html#id1

Although Gnocchi started as part of the Ceilometer project, it is now separate. Be aware of this: whenever you encounter an issue with Gnocchi, look for the solution on the Gnocchi site, not in Ceilometer resources.

Gnocchi is composed of several components. gnocchi-metricd and gnocchi-statsd are services running in the background to collect data from Ceilometer and other monitoring tools. If these services are not running properly, you can still use the Gnocchi client to retrieve the resource list, but the measures will be empty because no monitored data is being collected.

While metricd and statsd are in charge of data collection, a WSGI application running under Apache httpd provides the API for the Gnocchi client. This web application uses port 8041 by default, which is also registered as the endpoint in OpenStack.

The Gnocchi client communicates with the Gnocchi API on port 8041 to retrieve the monitored and stored data from Gnocchi storage.

During the installation, you may choose how to store the Gnocchi data and how to index it. The default DevStack setting stores the data as files and uses MySQL for indexing.

If you want to monitor other services such as Neutron, Glance, etc., the link above also has instructions on how to configure them for monitoring.

2. Install Ceilometer on Compute nodes
Follow the installation guide provided by OpenStack:
https://docs.openstack.org/ceilometer/pike/install/install-compute-rdo.html

Note that ceilometer-compute must be installed on every compute node; it is in charge of monitoring the compute resources (CPU, memory, and so on for the VM instances).

3. Check Gnocchi and Ceilometer and troubleshooting
Once the installation is done, you should be able to use the Gnocchi client to retrieve the monitored data.
Follow the verify instruction:
https://docs.openstack.org/ceilometer/pike/install/verify.html

If the "gnocchi resource list" command does not work, there is a problem with the Gnocchi API running on httpd.
One possible cause is the Redis server, which is used by the Gnocchi API and other OpenStack services to communicate with each other. Check that the Redis server is listening on port 6379.
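A quick way to check whether anything is listening on a port is a plain TCP connection attempt; a small sketch (helper name is mine):

```python
import socket

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. check whether Redis is listening locally:
# port_open("127.0.0.1", 6379)
```

The same check works for the Gnocchi API on port 8041.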

If "gnocchi resource list" works but "gnocchi measures show ..." returns an empty result, Gnocchi is not collecting any data from Ceilometer. First, check gnocchi-statsd and gnocchi-metricd; if they are not running properly, Gnocchi cannot gather data. Also check the Ceilometer settings to make sure it is monitoring and reporting correctly.

If the Gnocchi measures are still not updating correctly, it is good practice to upgrade the Gnocchi / Ceilometer / database schemas with these commands:

$ gnocchi-upgrade
$ ceilometer-upgrade --skip-metering-database

4. Monitor hosts (hypervisors)
Ceilometer monitors only the VM instances on a host by default. If you want to monitor the compute hosts themselves, e.g. for VM provisioning decisions, add the following lines to nova.conf:

[DEFAULT]
compute_monitors = cpu.virt_driver,numa_mem_bw.virt_driver

After the successful configuration, "gnocchi resource list" will show one more resource called "nova_compute".

If Gnocchi reports that there is no such resource, your Ceilometer version is probably too old: old versions of Ceilometer did not create a nova_compute resource in Gnocchi. Check your Ceilometer log; there will be error messages like:

metric compute.node.cpu.iowait.percent is not handled by Gnocchi

If so, update your Ceilometer version, or fix it by changing the Ceilometer source code and the /etc/ceilometer/gnocchi_resources.yaml file.

Refer to this commit message:
https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=5e430aeeff8e7c641e4b19ba71c59389770297ee


5. Sample commands and results

To retrieve all monitored VM instances (one resource ID corresponds to one VM instance):
$ gnocchi resource list -t instance -c id -c user_id -c flavor_name
+--------------------------------------+----------------------------------+-------------+
| id                                   | user_id                          | flavor_name |
+--------------------------------------+----------------------------------+-------------+
| 2e3aa7f0-4280-4d2a-93fb-59d6853e7801 | e78bd5c6d4434963a5a42924889109da | m1.nano     |
| a10ebdc8-c8bd-452c-958c-d811baaf0899 | e78bd5c6d4434963a5a42924889109da | m1.nano     |
| 08c6ea86-fe1f-4636-b59e-2b1414c978a0 | e78bd5c6d4434963a5a42924889109da | m1.nano     |
+--------------------------------------+----------------------------------+-------------+

To retrieve the CPU utilization of the third VM instance from the result above:
$ gnocchi measures show cpu_util --resource-id 08c6ea86-fe1f-4636-b59e-2b1414c978a0
+---------------------------+-------------+----------------+
| timestamp                 | granularity |          value |
+---------------------------+-------------+----------------+
| 2017-10-25T14:55:00+00:00 |       300.0 | 0.177451422033 |
| 2017-10-25T15:00:00+00:00 |       300.0 |   0.1663312144 |
| 2017-10-25T15:05:00+00:00 |       300.0 |   0.1778018934 |
+---------------------------+-------------+----------------+
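Once you have measures like the above, post-processing is straightforward: each measure is a (timestamp, granularity, value) row. A small sketch averaging the sample values above (helper name is mine):

```python
# Sample rows copied from the `gnocchi measures show cpu_util` output above.
measures = [
    ("2017-10-25T14:55:00+00:00", 300.0, 0.177451422033),
    ("2017-10-25T15:00:00+00:00", 300.0, 0.1663312144),
    ("2017-10-25T15:05:00+00:00", 300.0, 0.1778018934),
]

def average_value(rows):
    """Mean of the 'value' column across all measures."""
    return sum(v for _, _, v in rows) / len(rows)

print(round(average_value(measures), 4))  # 0.1739
```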

To retrieve the resource ID of compute hosts:
$ gnocchi resource list -t nova_compute -c id -c host_name
+--------------------------------------+------------------+
| id                                   | host_name        |
+--------------------------------------+------------------+
| 52978e00-6322-5498-9c9a-40fc5dca9571 | compute.devstack |
+--------------------------------------+------------------+

To retrieve the CPU utilization of the compute host:
$ gnocchi measures show compute.node.cpu.percent --resource-id 52978e00-6322-5498-9c9a-40fc5dca9571
+---------------------------+-------------+-------+
| timestamp                 | granularity | value |
+---------------------------+-------------+-------+
| 2017-10-25T15:10:00+00:00 |       300.0 |  83.0 |
| 2017-10-25T15:15:00+00:00 |       300.0 |  17.4 |
| 2017-10-25T15:20:00+00:00 |       300.0 |  14.8 |
| 2017-10-25T15:25:00+00:00 |       300.0 |  15.5 |
+---------------------------+-------------+-------+


6. Default configurations from DevStack

/etc/gnocchi/gnocchi.conf :

[metricd]
metric_processing_delay = 5

[storage]
file_basepath = /opt/stack/data/gnocchi/
driver = file
coordination_url = redis://localhost:6379

[statsd]
user_id = XXXX
project_id = XXXX
resource_id = XXXX

[keystone_authtoken]
memcached_servers = 192.168.50.111:11211
signing_dir = /var/cache/gnocchi
cafile = /opt/stack/data/ca-bundle.pem
project_domain_name = Default
project_name = service
user_domain_name = Default
password = XXXX
username = gnocchi
auth_url = http://192.168.50.111/identity
auth_type = password

[api]
auth_mode = keystone

[indexer]
url = mysql+pymysql://root:XXXX@127.0.0.1/gnocchi?charset=utf8


/etc/ceilometer/ceilometer.conf :

[DEFAULT]
transport_url = rabbit://stackrabbit:XXXX@192.168.50.111:5672/

[oslo_messaging_notifications]
topics = notifications

[coordination]
backend_url = redis://localhost:6379

[notification]
pipeline_processing_queues = 2
workers = 2
workload_partitioning = True

[cache]
backend_argument = url:redis://localhost:6379
backend_argument = distributed_lock:True
backend_argument = db:0
backend_argument = redis_expiration_time:600
backend = dogpile.cache.redis
enabled = True

[service_credentials]
auth_url = http://192.168.50.111/identity
region_name = RegionOne
password = XXXX
username = ceilometer
project_name = service
project_domain_id = default
user_domain_id = default
auth_type = password

[keystone_authtoken]
memcached_servers = 192.168.50.111:11211
signing_dir = /var/cache/ceilometer
cafile = /opt/stack/data/ca-bundle.pem
project_domain_name = Default
project_name = service
user_domain_name = Default
password = XXXX
username = ceilometer
auth_url = http://192.168.50.111/identity
auth_type = password

/etc/ceilometer/polling.yaml :

---
sources:
    - name: all_pollsters
      interval: 120
      meters:
        - "*"

Tuesday 24 October 2017

Install DevStack on CentOS 7, without dependency errors

DevStack is a single-machine OpenStack deployment intended for development work on OpenStack. However, installing DevStack is not an easy process, especially when you hit a dependency error (like any other dependency problem...). In this article, I describe how to install DevStack on CentOS 7 without any dependency errors.

1. Install CentOS 7.
Download the Minimal ISO version of CentOS 7 and install. If you install CentOS/DevStack on a virtual machine, my personal recommended settings are:
- CPU: 2 cores
- RAM: 4GB
- HDD: 10GB

2. Set up network
CentOS 7 uses NetworkManager by default, so the network can be set up with the 'nmtui' tool. If you prefer not to use NetworkManager, disable it and configure the network the traditional way by editing the /etc/sysconfig/network-scripts/ifcfg-* files.

3. Install git and add user 'stack'.
First, install git software using yum:

$ sudo yum install -y git

Then, add a non-root user 'stack' to run DevStack. Details can be found in the official documentation:
https://docs.openstack.org/devstack/latest/#add-stack-user

Switch to the user 'stack' using su command:

$ sudo su - stack

4. Clone DevStack git repository
Clone DevStack git repository using the following command:

$ git clone https://git.openstack.org/openstack-dev/devstack
$ cd devstack

5. Switch to the most recent stable branch *IMPORTANT*
Use the following command to switch to the most recent stable branch:

$ git checkout stable/pike

You can list the available branches with 'git branch -r'.

When the git repository is cloned, it is synced to the latest master branch of DevStack, which may include bugs that cause installation errors. I strongly suggest using the latest 'stable' branch instead of master. The master branch includes all the new features and functions, which may cause problems.

Also, do NOT use any OLD stable branches. Use only the most recent stable branch (stable/pike as of October 2017), because DevStack always pulls from the most recent yum package repositories during installation, regardless of the selected git branch.

6. Create local.conf
Create a local.conf file in the devstack directory with the following contents:

[[local|localrc]]
ADMIN_PASSWORD=secret
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD
HOST_IP=127.0.0.1

Change "secret" and the local IP address to suit your environment. Also enable extra features if needed. For example, if Ceilometer is necessary, add the following line to local.conf:

enable_plugin ceilometer https://git.openstack.org/openstack/ceilometer stable/pike

7. Snapshot the virtual machine (Optional)
If you're installing DevStack on a VM, this is the best time to snapshot the VM, because the './stack.sh' script heavily modifies the configuration and installed packages of our freshly installed CentOS 7.

8. Start the installation script

$ ./stack.sh

This command starts the installation of DevStack. The script uses python, pip, git, yum, and other tools, changing the installed packages and dependencies according to the DevStack git repository and the recent OpenStack release.

If the script ends with an error, e.g. a dependency error, a version error, or a failure installing a module, restore the snapshot from Step 7 and go back to Step 5 to check which git branch is in use. Make sure the branch is the MOST RECENT stable OpenStack release.

Saturday 23 September 2017

Galaxy S6 SM-G9200 China model - Installing Google Play Store and other apps (Open GApps)

The G9200 is the China and Hong Kong variant of the Samsung Galaxy S6. Although Samsung uses the same model number G9200 for both the Chinese and HK markets, the firmware differs depending on the region where you bought the phone.

The G9200 HK has the Google Play Store and other Google apps pre-installed, like other Android phones, whereas the G9200 China Open has no access to any Google apps, including the Play Store. On the Chinese G9200, the only way to download and install Android apps is through Samsung's Galaxy Apps or a third-party Chinese app store.

In this article, I describe how I enabled the Google Play Store and other Google services on the G9200 China model. This is based on Android 6.0.1, but it should work on 7.0 too.

Install G9200 HK stock firmware using Odin? Failed!

Unlike other variants, which can install another region's stock firmware without any hassle using Samsung's official Odin program, the G9200 China model does not let you install the HK firmware. Odin treats the G9200 China as a completely different model from the G9200 HK.
When I tried to flash the G9200 HK firmware onto a G9200 China model through Odin, an error appeared on the G9200's download screen saying something like "SECURE CHECK FAIL : PIT". The message basically means you cannot flash this firmware onto the phone because it is an incompatible model.

There is NO way to flash the G9200 HK stock firmware onto a G9200 China device using Odin. You might try extracting the PIT file from the Chinese firmware, putting it into the HK firmware files, and recreating an MD5 file for flashing. Be careful when playing with PIT files: it is risky and can brick your phone.

Install OpenGApps using TWRP & Rooting

As the first method doesn't work, the only option is to install the Google applications manually through TWRP recovery. Open GApps is an open-source project providing a generic Google apps installation method through TWRP-installable zip files.
To use this method, you have to replace the stock recovery with TWRP. To make TWRP work properly on the Galaxy S6, you also have to root the phone to disable some verification features in the stock kernel. This means your warranty will be void once you install TWRP or root the phone, so proceed at your own risk. This is a step-by-step guide with troubleshooting.

1.  Install CROM from Galaxy Apps

CROM is a protection mechanism on Chinese Galaxy models that prohibits you from installing any custom ROM. Before you proceed to install TWRP and root the phone, CROM must be disabled through Samsung's official CROM Service app. The app can be found on Galaxy Apps (Samsung's default app store) or installed as an APK found through Google.
Once the CROM Service app is installed, run it and disable the CROM service.

2. CF-Root through ODIN

CF-Auto-Root provides an easy rooting method for G9200 models using Odin. Simply download the CF-Auto-Root file from the official website: https://download.chainfire.eu/1109/CF-Root/CF-Auto-Root/CF-Auto-Root-zerofltechn-zerofltezc-smg9200.zip
Once downloaded, use Odin to flash the CF-Auto-Root file onto the device. After rebooting, a new app called SuperSU will appear on your phone.

3. Install TWRP or other program to install GApps zip

For the G9200 China model, finding a proper TWRP archive is not easy: the model is so unusual that the official TWRP file for the G9200 is not really compatible with G9200 China. I guess the official TWRP is only for the G9200 HK model. When I tried the official G9200 TWRP file (zerofltezt) on a G9200 China model, the device did not boot properly right after the flash completed.

Instead, look for a TWRP build specifically modified for the G9200 China model, named "G9200-PC1-TWRP-3.0-PC1(0324).tar". Google is always your friend for finding a specific file online. Although it is built for Android 6.0.1, it might work on 7.0. Give it a go, and re-flash your device to 6.0 if it doesn't work.

Make sure your phone is already rooted and dm-verity is disabled. Otherwise, you may hit an infinite reboot loop once you enter TWRP recovery mode by pressing VolUp+Home+Power.

Instead of using TWRP, there are alternative ways to install the GApps zip file onto your /system directory. Look into the Flashify app or the manual method for more information.

If the device does not boot properly, you have to flash the stock firmware through Odin to recover it. After flashing the stock image, root the phone first (Step 2) and then install TWRP.

Make sure you can enter TWRP recovery with the hot key (VolUp+Home+Power) and that the phone still boots normally afterwards. You might often encounter the infinite reboot issue after running TWRP recovery once, which is caused by Samsung's stock firmware refusing to boot.

If the problem persists, try this method:
- Flash the stock firmware image using Odin and reboot normally.
- Flash CF-Root using Odin and reboot normally. CF-Root takes some time to root the phone.
- Uncheck "Auto re-boot" option in Odin, and flash TWRP. Do NOT reboot after the download.
- Reboot by pressing VolDown+Home+Power, and quickly press VolUp instead of VolDown, in order to boot into TWRP recovery mode.
- The point is to get into TWRP recovery mode first before booting into the normal system mode.
- Once TWRP recovery loads properly, reboot and check that a normal boot works fine.

4. Download Open GApps zip file

Download GApps zip file from the official OpenGApps website: http://opengapps.org/
Choose ARM64, your Android version, and a package variant. The smallest is 'pico', which installs only Google Play Store and other essential files. If you want more Google features, such as face recognition, Google Now, or Gmail as a default application, check the package comparison and choose the right package. I think 'pico' is enough for most users; other apps can be downloaded later through Google Play Store.
Copy the downloaded zip file onto your phone to install later.

5. Delete files in /system directory to make more space

Google applications will be installed onto the /system partition, but it has little available space. You have to remove some files from /system to make enough room for the Google apps.
Install Root Explorer or any root-enabled file manager by downloading its APK from the internet, and delete some 'deletable' files in /system/preload. Open GApps 'pico' needs about 100~200 MB, so deleting a few apps from preload is enough. Other GApps packages may take more space, in which case check the size of the package first.
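The space check above can be sketched as a couple of shell lines. The numbers here are placeholders, not real values from any device: read the actual free space from "adb shell df /system" and the package size from the Open GApps download page.

```shell
# Hypothetical sizes in KB -- replace with your device's real numbers
free_kb=150000    # free space reported by df on /system
gapps_kb=120000   # approximate installed size of Open GApps "pico"

if [ "$free_kb" -ge "$gapps_kb" ]; then
    echo "enough space"                                   # prints: enough space
else
    echo "free $(( (gapps_kb - free_kb) / 1024 )) MB more from /system/preload"
fi
```

If the second branch fires, delete more apps from /system/preload and re-run the check before rebooting into recovery.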


6. Install OpenGApps through TWRP or alternative method

Boot into TWRP recovery mode, choose the top-left menu for installing a zip file, and select the copied Open GApps zip file.
If you chose not to install TWRP, use the alternative method to install the Open GApps zip file onto your /system partition.

Wipe cache/dalvik after installing the zip file.

7. Give all permissions to Google Play Store and Google Play Services.

Allow every permission for the Google Play Store and Google Play Services apps in the application settings. If the proper permissions are not granted, you will see many errors when opening Play Store, such as connection errors, "application stopped", unexpected termination, etc. Make sure all permissions are allowed for both apps.

8. Enjoy Google Play Store!


Troubleshooting

Q. I don't want to install a custom recovery (TWRP). Any other method to install GApps zip file?
A. Flashify seems to be able to install the Open GApps zip file onto the system partition without a custom recovery. Try it and let us know if it works.

Q. TWRP is successfully installed, but touch screen is not working in TWRP recovery mode!
A. You installed wrong TWRP for another variant (e.g., for international G920F model). Find a correct TWRP file to install.

Q. After installing TWRP, mobile is turning off even before booting.
A. The TWRP installed is a wrong version (maybe for G9200 HK version). Find the one for G9200 China, and install again.

Q. Normal system booting is alright, but once entering TWRP recovery, mobile keeps rebooting.
A. Your stock firmware refuses to boot because it detects non-Samsung boot modules. Make sure the phone is rooted before installing TWRP. Also, boot into TWRP recovery mode right after flashing with Odin, before the first normal boot, so that TWRP can alter the stock firmware to bypass the custom-recovery detection.

Q. Google Play Store is installed, but there are errors to run it.
A. Wipe cache/dalvik from recovery. Also, delete data and cache for both Google Play Store and Google Play Services in the application settings. Check the permissions of both apps and allow everything. If it still doesn't help, double-check your Android version and install another package (nano or micro) from Open GApps.


Monday 24 July 2017

Integration of OpenDaylight (ODL) with OpenStack to manage OpenVSwitch (OVS)

neutron-openvswitch-agent

OpenStack by default uses its own neutron layer-2 agent plugin to manage cloud networking. "neutron-openvswitch-agent" is the most common L2 agent; it manages an Open vSwitch (an OpenFlow-compatible software switch provided in Linux) on each compute node. When a new instance is created, nova communicates with neutron for network configuration, such as assigning an IP address, adding a bridge, and creating a network tunnel. Neutron-server on the controller node then communicates with the neutron-openvswitch-agent on the compute node that will host the VM to actually create a new port and tunnel for the VM.

To set up openvswitch-agent as the L2 driver, configure neutron as follows.

1. Controller/Network node

Install:
# yum install openstack-neutron openstack-neutron-openvswitch
/etc/neutron/neutron.conf:
...
[DEFAULT]
core_plugin=neutron.plugins.ml2.plugin.Ml2Plugin
/etc/neutron/plugins/ml2/ml2_conf.ini:
...
[ml2]
mechanism_drivers =openvswitch
/etc/neutron/plugins/ml2/openvswitch.ini
tunnel_bridge = br-tun
local_ip = [LOCAL_IP]
bridge_mappings = extnet:br-ex  # the br-ex interface must be set up manually for external connectivity

Of course, there is much more configuration beyond the settings above. In this article, we only cover the differences between the OVS and ODL setups. For more details on how to set up neutron-openvswitch, check the OpenStack guides.

2. Compute node

Install openvswitch-agent:
# yum install openstack-neutron-openvswitch
/etc/neutron/plugins/ml2/openvswitch.ini
tunnel_bridge = br-tun
local_ip = [LOCAL_IP]
In the end, both the controller and compute nodes run openvswitch-agent, which communicates with the neutron server on the controller.

OpenDaylight + OpenStack

Although OpenStack's default openvswitch plugin provides extensive functionality, it does not deliver the full SDN functionality that OpenFlow switches connecting physical hosts can bring. This L2 driver only communicates with the OVS instances inside hypervisors, not with physical switches, so the physical network cannot be managed. Also, since the OVS on each compute node only talks to its own local agent (the neutron-openvswitch-agent running on that node), the OVS instances are not managed globally by a central controller. Instead, neutron-server on the controller is in charge of managing every per-node neutron-openvswitch-agent, which in turn controls its local OVS.

If you want to empower your cloud with full SDN functionality, it is a good idea to use a separate SDN controller to manage the whole network. OpenDaylight (ODL) is one of the popular SDN controllers, and the ODL community works closely with OpenStack on integration.

ODL can run side by side with OpenStack. In contrast to the openvswitch-agent setup, where each compute node's OVS connects to its local agent, all OVS instances connect to ODL remotely. In this mode ODL functions as the L2 agent, so the central ODL controller manages the OVS on every compute node. Since OVS connects directly to ODL, neutron-openvswitch-agent is no longer necessary on any node.

To use ODL with OpenStack, a specific L2 driver is necessary to let OpenStack communicate with ODL's NorthBound API. The ODL and OpenStack teams created the 'networking-odl' module for this purpose.

More detailed installation instructions can be found on the ODL and OpenStack sites.

1. Controller/network node

Uninstall:
   # yum remove openstack-neutron-openvswitch

Install:
   # yum install python-networking-odl

/etc/neutron/neutron.conf
   service_plugins = odl-router

/etc/neutron/plugins/ml2/ml2_conf.ini
   tenant_network_types=vxlan
   mechanism_drivers = opendaylight
   port_binding_controller = network-topology
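The fragment above omits the section that actually points the ML2 driver at ODL. With networking-odl v1 this typically lives in an [ml2_odl] block of ml2_conf.ini; the IP address, port, and admin credentials below are placeholders for your own deployment, not values from this setup:

```
[ml2_odl]
url = http://ODL_IP:8080/controller/nb/v2/neutron
username = admin
password = admin
```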

Running agents:

  • neutron-dhcp-agent
  • neutron-metadata-agent
  • neutron-metering-agent

<Note 1>
opendaylight_v2 and odl-router_v2 can be used alternatively; they are under experimental development. For experimental usage, v2 is a good option as it includes all the new features. For stable usage, v1 is recommended, as you will hit fewer errors. Remember that these drivers must be used as a pair; e.g., opendaylight_v2 cannot be combined with odl-router.

<Note 2>
The port_binding_controller setting determines how to get the host configuration for binding a port. Either "network-topology" or "pseudo-agentdb-binding" can be used. The former uses the network topology and needs no extra configuration. The latter reads 'hostconfigs' from OVS, but these must be set up with the "neutron-odl-ovs-hostconfig" command. The default is "pseudo-agentdb-binding", but without hostconfigs you will get error messages like these:

No valid hostconfigs in agentsdb for host
ERROR networking_odl.ml2.pseudo_agentdb_binding KeyError: 'hostconfig'
WARNING networking_odl.ml2.pseudo_agentdb_binding [-] ODL hostconfigs REST/GET failed, will retry on next poll

If so, set up proper hostconfigs using 'neutron-odl-ovs-hostconfig' or change the 'port_binding_controller' setting.

<Note 3>
Neutron's L3 agent can be replaced by ODL. In that case, disable neutron-l3-agent and enable ODL's L3 forwarding feature by setting "ovsdb.l3.fwd.enabled=yes" in the ...karaf/etc/custom.properties file.

<Note 4>
To install networking-odl, I recommend using yum instead of pip, as pip can mess up Python dependencies and create conflicts with OpenStack. Although the instruction guide recommends pip, consider yum, especially if your other OpenStack components were installed by yum or PackStack. Since PackStack uses yum, there will be no dependency issues.


2. Compute node

Uninstall:
# yum remove openstack-neutron-openvswitch
Configuration:
# ovs-vsctl set-manager tcp:${CONTROL_HOST}:6640
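The switchover on a compute node boils down to pointing the local OVS at the ODL controller. As a sketch, CONTROL_HOST and the OVSDB management port 6640 below are example values, not taken from this deployment:

```shell
# Build the manager target string that ovs-vsctl expects.
# CONTROL_HOST (the ODL controller address) is an assumption for this example.
CONTROL_HOST=192.168.0.10
MANAGER="tcp:${CONTROL_HOST}:6640"
echo "$MANAGER"

# Then, on the compute node (requires openvswitch installed):
#   ovs-vsctl set-manager "$MANAGER"
# Verify with "ovs-vsctl show": a Manager line with the ODL address should appear.
```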

Useful commands:
systemctl stop neutron-server
systemctl stop neutron-openvswitch-agent
systemctl stop openvswitch

Some tips...

  1. ODL (karaf) and the networking-odl module are necessary only on the controller node. On compute nodes, just change the OVS manager to point at ODL.
  2. Don't confuse openvswitch (OVS) with neutron-openvswitch-agent. OVS is a virtual switch provided by the Linux kernel that mimics an OpenFlow switch; neutron-openvswitch-agent is the agent software used by OpenStack neutron to manage OVS on compute nodes.
  3. ODL karaf can run as a daemon. Search for 'karaf daemon' for instructions.
  4. As mentioned above, the configurations explained here are partial and omit a lot of information. Please refer to the full instruction guides for installing ODL and setting up networking-odl.

Monday 29 May 2017

L2TP / IPSec setup guide for CentOS 7

Reference: http://blog.earth-works.com/2013/02/22/how-to-set-up-openswan-l2tp-vpn-server-on-centos-6/

OpenVPN is easy to set up, but it needs an extra program installed on the client side. On the other hand, L2TP/IPSec is implemented in most operating systems, such as Windows 7/8/10, macOS, several Linux distributions, Android, and iOS, so most clients can connect to an L2TP/IPSec VPN out of the box.

In this article, we explain how to set up an L2TP/IPSec server on CentOS 7.

1. Install the epel repository for extra CentOS packages. This is necessary for the xl2tpd installation.

# sudo yum -y install epel-release

2. Install necessary packages.

# yum install lsof man openswan xl2tpd

3. Design the network and decide the range of IP addresses. In this article, we use the following IP address ranges. Please note that we configure the VPN as part of the VPN server's current LAN, so VPN-connected clients will join the LAN network.

[Physical settings that already configured]

  • 192.168.0.0/24 : Physical LAN network where the VPN server resides
  • 192.168.0.1 : Physical IP address of the VPN server (already set up)


[VPN networks using for VPN setting]

  • 192.168.0.201 : Local IP used by the VPN server for the L2TP tunnel. Choose any free IP in the LAN range.
  • 192.168.0.202-250 : Local IP range for VPN-connected clients.


4. Allow IP forwarding for NAT in /etc/sysctl.conf
# Controls IP packet forwarding
net.ipv4.ip_forward = 1

5. Reload sysctl to make the config effective
sysctl -p

7. /etc/rc.local
for each in /proc/sys/net/ipv4/conf/*; do
        echo 0 > $each/accept_redirects
        echo 0 > $each/send_redirects
        echo 0 > $each/rp_filter
done

8. /etc/ipsec.conf
# /etc/ipsec.conf - Openswan IPsec configuration file
#
# Manual:     ipsec.conf.5
#
# Please place your own config files in /etc/ipsec.d/ ending in .conf
version       2.0    # conforms to second version of ipsec.conf specification
# basic configuration
config setup
       protostack=netkey
       plutostderrlog=/var/log/pluto.log
       interfaces="%defaultroute"
       plutodebug=none
       virtual_private=%v4:192.168.0.0/24
       nat_traversal=yes
conn L2TP-PSK
       authby=secret
       pfs=no
       auto=add
       keyingtries=3
       type=transport
       left="%defaultroute"
       leftprotoport=17/1701
       right=%any
       rightprotoport=17/0
       # Apple iOS doesn't send delete notify so we need dead peer detection
       # to detect vanishing clients
       dpddelay=10
       dpdtimeout=90
       dpdaction=clear
#You may put your configuration (.conf) file in the "/etc/ipsec.d/" and uncomment this.
#include /etc/ipsec.d/*.conf
9. Generate a key file to /etc/ipsec.secrets
ipsec newhostkey --output /etc/ipsec.secrets --verbose --configdir /etc/pki/nssdb/
10. Add the PSK key (shared between clients/server) at the end of /etc/ipsec.secrets
192.168.0.1      %any:     PSK     "yourPSKHere"
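Rather than typing a PSK by hand, you can generate a random one and paste the resulting line into /etc/ipsec.secrets. This is just a sketch; the server address is the example IP from this setup:

```shell
# Generate a random pre-shared key: 24 random bytes base64-encode
# to a 32-character string.
PSK=$(head -c 24 /dev/urandom | base64 | tr -d '\n')

# Print the line to append to /etc/ipsec.secrets
echo "192.168.0.1      %any:     PSK     \"$PSK\""
```

Remember to use the same generated PSK on every client in Step 16.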

11. /etc/xl2tpd/xl2tpd.conf
[global]
listen-addr = 192.168.0.1
;
; requires openswan-2.5.18 or higher - Also does not yet work in combination
; with kernel mode l2tp as present in linux 2.6.23+
; ipsec saref = yes
; Use refinfo of 22 if using an SAref kernel patch based on openswan 2.6.35 or
;  when using any of the SAref kernel patches for kernels up to 2.6.35.
; ipsec refinfo = 30
;
; works around bug: http://bugs.centos.org/view.php?id=5832
force userspace = yes

;
[lns default]
ip range = 192.168.0.202-192.168.0.250
local ip = 192.168.0.201
; leave chap unspecified for maximum compatibility with windows, iOS, etc
; require chap = yes
refuse pap = yes
require authentication = yes
name = CentOSVPNserver
ppp debug = yes
pppoptfile = /etc/ppp/options.xl2tpd
length bit = yes

12. Update DNS server (ms-dns) on /etc/ppp/options.xl2tpd
ms-dns 8.8.8.8

13. Add ID/PW of users at /etc/ppp/chap-secrets
# client        server  secret                  IP addresses
user1           *       strongPassword1         *
user2           *       strongPassword2         *
13-1. Alternatively, use Linux's ID/PW for login. Follow the instructions on this article:
https://raymii.org/s/tutorials/IPSEC_L2TP_vpn_on_CentOS_-_Red_Hat_Enterprise_Linux_or_Scientific_-_Linux_6.html#Local_user_(PAM//etc/passwd)_authentication

14. Setup iptables
#Allow ipsec traffic
iptables -A INPUT -m policy --dir in --pol ipsec -j ACCEPT
iptables -A FORWARD -m policy --dir in --pol ipsec -j ACCEPT
#Do not NAT VPN traffic
iptables -t nat -A POSTROUTING -m policy --dir out --pol none -j MASQUERADE
#Forwarding rules for VPN
iptables -A FORWARD -i ppp+ -p all -m state --state NEW,ESTABLISHED,RELATED -j ACCEPT
iptables -A FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT
#Ports for Openswan / xl2tpd
iptables -A INPUT -m policy --dir in --pol ipsec -p udp --dport 1701 -j ACCEPT
iptables -A INPUT -p udp --dport 500 -j ACCEPT
iptables -A INPUT -p udp --dport 4500 -j ACCEPT
#Save your configuration (requires the iptables-services package on CentOS 7)
service iptables save
15. Enable and start services
systemctl enable ipsec
systemctl start ipsec
systemctl enable xl2tpd
systemctl start xl2tpd

16. Configure clients (Windows/Mac/Linux/etc..)

  • Type of VPN: L2TP/IPSec
  • L2TP Security: choose pre-shared key for authentication. Put the PSK ("yourPSKHere") configured in Step 10.
  • ID/PW: ones set up in Step 13


Tuesday 7 February 2017

Samsung Galaxy S7 Duos (SM-G930FD) dual sim 4g + 3g mode

The Galaxy S7 and S7 Edge have dual-sim models, the so-called S7 (Edge) Duos, sold in a few countries. If you have one with the model number SM-G930FD / SM-G935FD, your S7 is capable of connecting to two separate networks at the same time. The sim tray has two slots where two different nano-sized sim cards can be placed.

However, on some S7 Duos models (especially those sold in India or the Middle East), 4G+3G dual mode does not work and the phone is forced into 4G+2G mode. While the main sim card can connect to the 4G or 3G network with data enabled, the sub sim card can connect to 2G only and is not allowed to switch to 3G.

This is because Samsung blocked the 2nd sim from connecting to 3G, as 3G was regarded as a data network that would conflict with the main sim's 4G (or 3G) data connection. However, these days many countries use 3G as their main network, carrying not only data but also voice and SMS text messages. For countries and carriers where no 2G network (GSM, GPRS, or EDGE) is available, this is troublesome, as the 2nd sim cannot connect to 3G, making the dual-sim functionality useless.

If your S7 Duos cannot switch the 2nd sim to 3G and only works in 4G+2G mode, it can be fixed by flashing a firmware from a different country. As far as I tested, the recent G930FD firmware for Thailand works perfectly in 4G+3G dual mode without any issue. Before changing firmware, you might want to update to the latest version of your current firmware and test 4G+3G again, because the issue is sometimes fixed in the latest version.

Be aware that this method requires a factory reset and changes the CSC code (country and carrier code) of the firmware. If your current CSC code belongs to a multi-CSC and is not the default CSC in it, it will be difficult to return to your original CSC after changing it.

Changing the S7's firmware to another official Samsung firmware using Odin is simple and does not trip Knox. If you want to change the firmware back, you can always return it to the original state, except when your original CSC is a non-default member of a multi-CSC.

You can use Odin to flash your phone to another firmware version. Follow these simple steps.
1) Download the latest ODIN program and Samsung USB driver package, and install them on your Windows computer.
2) Download the latest Thailand firmware of S7 Duos (SM-G930FD / G935FD) from SamMobile.com or SamFirm application.
3) Back up all the data in your mobile
4) Turn off the mobile, and turn on again by pushing VolDown + Home + Power key simultaneously.
5) The phone boots into Download mode; press the VolUp key to enter the download-ready screen.
6) Open Odin and load the downloaded firmware files into Odin (BL+AP+CP+CSC). Make sure the CSC selection is the CSC file without the HOME_ prefix. FYI, HOME_CSC is used only for updating within the same firmware variant, which does not wipe your data.
7) Connect S7 to the computer with USB cable. Connection is detected in ODIN.
8) Start download. Do not change any options in ODIN (only Auto Reboot, F.Reset Time checked, all others unchecked).
9) Once the firmware is flashed, the phone resets and takes a while to boot up.
10) (IMPORTANT!!) An extra factory reset is necessary to make sure your CSC changes to THL (the Thailand code) before using your phone. Once the phone has booted and shows the welcome screen, turn it off again, then turn it on by holding VolUp + Home + Power to boot into Recovery Mode. In the Recovery menu, select 'wipe data/factory reset' and 'wipe cache partition', then reboot.
11) Now your S7 runs the Thailand firmware entirely, which means 4G+3G dual mode works perfectly.

More details can be found on the internet. There are many guides explaining how to flash S7 firmware using Odin; search for 'S7 Odin'.

Note that 4G+3G dual mode works only on the Thailand firmware with the THL CSC, even after the 7.0 update. If your CSC is not THL, 4G+3G will not work correctly no matter what firmware you run. This is confirmed as of February 2017; I hope Samsung will notice and patch it across all regions soon.
